NOVEL NUCLEIC ACIDS AND SECRETED
POLYPEPTIDES 

1. CROSS REFERENCE TO RELATED APPLICATIONS 

This application is a continuation-in-part application of U.S. Application Serial No.
09/552,317 filed April 25, 2000 entitled "Novel Contigs Obtained from Various Libraries",
Attorney Docket No. 784CIP, which in turn is a continuation-in-part application of U.S. 
Application Serial No. 09/488,725 filed January 21, 2000 entitled "Novel Contigs Obtained
from Various Libraries", Attorney Docket No. 784; U.S. Application Serial No. 09/491,404
filed January 25, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney
Docket No. 785; U.S. Application Serial No. 09/560,875 filed April 27, 2000 entitled "Novel
Contigs Obtained from Various Libraries", Attorney Docket No. 787CIP, which in turn is a
continuation-in-part application of U.S. Application Serial No. 09/496,914 filed February 03, 
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 787;
U.S. Application Serial No. 09/577,409 filed May 18, 2000 entitled "Novel Contigs
Obtained from Various Libraries", Attorney Docket No. 788CIP, which in turn is a
continuation-in-part application of U.S. Application Serial No. 09/515,126 filed February 28, 
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 788; 
U.S. Application Serial No. 09/574,454 filed May 19, 2000 entitled "Novel Contigs
Obtained from Various Libraries", Attorney Docket No. 789CIP, which in turn is a
continuation-in-part application of U.S. Application Serial No. 09/519,705 filed March 07,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 789; 
U.S. Application Serial No. 09/649,167 filed August 23, 2000 entitled "Novel Contigs 
Obtained from Various Libraries", Attorney Docket No. 790CIP, which in turn is a
continuation-in-part application of U.S. Application Serial No. 09/540,217 filed March 31,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 790;
U.S. Application Serial No. 09/770,160 filed January 26, 2001 entitled "Novel Contigs
Obtained from Various Libraries", Attorney Docket No. 791CIP, which in turn is a
continuation-in-part application of U.S. Application Serial No. 09/552,929 filed April 18,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 791;
and U.S. Application Serial No. 09/577,408 filed May 18, 2000 entitled "Novel Contigs 
Obtained from Various Libraries", Attorney Docket No. 792; all of which are incorporated 
herein by reference in their entirety. 
2. BACKGROUND OF THE INVENTION

2.1 TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods. 

2.2 BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines such
as lymphokines, interferons, circulating soluble factors, chemokines, and interleukins) has
matured rapidly over the past decade. The now routine hybridization cloning and expression
cloning techniques clone novel polynucleotides "directly" in the sense that they rely on
information directly related to the discovered protein (i.e., partial DNA/amino acid sequence
of the protein in the case of hybridization cloning; activity of the protein in the case of
expression cloning). More recent "indirect" cloning techniques such as signal sequence
cloning, which isolates DNA sequences based on the presence of a now well-recognized
secretory leader sequence motif, as well as various PCR-based or low stringency
hybridization-based cloning techniques, have advanced the state of the art by making
available large numbers of DNA/amino acid sequences for proteins that are known to have
biological activity, for example, by virtue of their secreted nature in the case of leader
sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.
Identified polynucleotide and polypeptide sequences have numerous applications in,
for example, diagnostics, forensics, gene mapping, identification of mutations responsible
for genetic disorders or other traits, assessment of biodiversity, and production of many other
types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as
allelic variants, antisense polynucleotide molecules, and antibodies that specifically recognize 






WO 03/080795 



PCT/US02/25485 




one or more epitopes present on such polypeptides, as well as hybridomas producing such 
antibodies. 

The compositions of the present invention additionally include vectors, including 
expression vectors, containing the polynucleotides of the invention, cells genetically engineered 
to contain such polynucleotides and cells genetically engineered to express such
polynucleotides. 

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public
databases. The invention relates also to the proteins encoded by such polynucleotides, along
with therapeutic, diagnostic and research utilities for these polynucleotides and proteins. These
nucleic acid sequences are designated as SEQ ID NO: 1-1041, or 2083-2534 and are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C
is cytosine; G is guanine; T is thymine; and N is any of the four bases or unknown. In the
amino acids provided in the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences
that hybridize to the complement of SEQ ID NO: 1-1041, or 2083-2534 under stringent
hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ
ID NO: 1-1041, or 2083-2534. Also included is a polynucleotide comprising a nucleotide
sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-1041, or
2083-2534, or a degenerate variant or fragment thereof. The identifying sequence can be 100
base pairs in length.
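
As an illustration of the 90% identity threshold applied to an identifying sequence, the following is a minimal Python sketch. It assumes an ungapped, position-by-position comparison of two sequences of equal length; the function names `percent_identity` and `meets_identity_threshold` are hypothetical, and the specification does not prescribe this particular comparison method at this point.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions carrying the same base in two equal-length sequences.

    Assumption: ungapped comparison; 'N' is treated as an ordinary character,
    so it only matches another 'N'.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a: str, seq_b: str, threshold: float = 90.0) -> bool:
    """True if the two sequences share at least `threshold` percent identity."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    # One mismatch in ten positions gives 90% identity.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    print(meets_identity_threshold("ACGTACGTAC", "ACGTACGTAA"))
```

For a 100 base pair identifying sequence, this threshold would tolerate up to ten mismatched positions under the ungapped assumption.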

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534. The
sequence information can be a segment of any one of SEQ ID NO: 1-1041, or 2083-2534 that 
uniquely identifies or represents the sequence information of SEQ ID NO: 1-1041, or 2083- 
2534. 

A collection as used in this application can be a collection of only one polynucleotide.
The collection of sequence information or identifying information of each sequence can be
provided on a nucleic acid array. In one embodiment, segments of sequence information are
provided on a nucleic acid array to detect the polynucleotide that contains the segment. The
array can be designed to detect full-match or mismatch to the polynucleotide that contains the
segment. The collection can also be provided in a computer-readable format. 

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences;
and host cells or organisms transformed with these expression vectors. Nucleic acid sequences
(or their reverse or direct complements) according to the invention have numerous applications
in a variety of techniques known to those skilled in the art of molecular biology, such as use as
hybridization probes, use as primers for PCR, use in an array, use in computer-readable media,
use in sequencing full-length genes, use for chromosome and gene mapping, use in the
recombinant production of protein, and use in the generation of anti-sense DNA or RNA, their
chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083- 
2534 or novel segments or parts of the nucleic acids of the invention are used as primers in 
expression assays that are well known in the art. In a particularly preferred embodiment, the 
nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534 or novel segments or parts of the
nucleic acids provided herein are used in diagnostics for identifying expressed genes or, as well
known in the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed
sequence tags for physical mapping of the human genome. 

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-
1041, or 2083-2534; a polynucleotide comprising any of the full length protein coding
sequences of SEQ ID NO: 1-1041, or 2083-2534; and a polynucleotide comprising any of the
nucleotide sequences of the mature protein coding sequences of SEQ ID NO: 1-1041, or 2083-
2534. The polynucleotides of the present invention also include, but are not limited to, a
polynucleotide that hybridizes under stringent hybridization conditions to (a) the complement of
any one of the nucleotide sequences set forth in SEQ ID NO: 1-1041, or 2083-2534; (b) a
nucleotide sequence encoding any one of the amino acid sequences set forth in SEQ ID NO: 1-
1041, or 2083-2534; (c) a polynucleotide which is an allelic variant of any polynucleotides
recited above; (d) a polynucleotide which encodes a species homolog (e.g., orthologs) of any of
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a
specific domain or truncation of any of the polypeptides comprising an amino acid sequence set
forth in SEQ ID NO: 1042-2082, or 2535-2986, or Tables 3, 5, 6, or 8.
The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the
corresponding full length or mature protein. Polypeptides of the invention also include
polypeptides with biological activity that are encoded by (a) any of the polynucleotides having
a nucleotide sequence set forth in SEQ ID NO: 1-1041, or 2083-2534; or (b) polynucleotides
that hybridize to the complement of the polynucleotides of (a) under stringent hybridization
conditions. Biologically active variants of any of the polypeptide sequences in the Sequence
Listing, and "substantial equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%,
85%, 90%, 95%, 98% or 99% amino acid sequence identity) that preferably retain biological
activity are also contemplated. The polypeptides of the invention may be wholly or partially
chemically synthesized but are preferably produced by recombinant means using the genetically
engineered cells (e.g., host cells) of the invention.

The invention also provides compositions comprising a polypeptide of the invention. 
Polypeptide compositions of the invention may further comprise an acceptable carrier, such 
as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a 
polynucleotide of the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the
polypeptide from the culture or from the host cells. Preferred embodiments include those in
which the protein produced by such processes is a mature form of the protein. 

Polynucleotides according to the invention have numerous applications in a variety 
of techniques known to those skilled in the art of molecular biology. These techniques 
include use as hybridization probes, use as oligomers, or primers, for PCR, use for
chromosome and gene mapping, use in the recombinant production of protein, and use in
generation of anti-sense DNA or RNA, their chemical analogs and the like. For example,
when the expression of an mRNA is largely restricted to a particular cell or tissue type,
polynucleotides of the invention can be used as hybridization probes to detect the presence
of the particular cell or tissue mRNA in a sample using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional 
procedures and methods that are currently applied to other proteins. For example, a 
polypeptide of the invention can be used to generate an antibody that specifically binds the
polypeptide. Such antibodies, particularly monoclonal antibodies, are useful for detecting or 
quantitating the polypeptide in tissue. The polypeptides of the invention can also be used as 
molecular weight markers, and as a food supplement. 

Methods are also provided for preventing, treating, or ameliorating a medical
condition which comprises the step of administering to a mammalian subject a
therapeutically effective amount of a composition comprising a polypeptide of the present
invention and a pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, 
for example, in methods for the prevention and/or treatment of disorders involving aberrant
protein expression or biological activity.

The present invention further relates to methods for detecting the presence of the 
polynucleotides or polypeptides of the invention in a sample. Such methods can, for 
example, be utilized as part of prognostic and diagnostic evaluation of disorders as recited 
herein and for the identification of subjects exhibiting a predisposition to such conditions.
The invention provides a method for detecting the polynucleotides of the invention in a
sample, comprising contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of interest for a period sufficient to form the complex and 
under conditions sufficient to form a complex and detecting the complex such that if a 
complex is detected, the polynucleotide of interest is detected. The invention also provides a
method for detecting the polypeptides of the invention in a sample comprising contacting the
sample with a compound that binds to and forms a complex with the polypeptide under 
conditions and for a period sufficient to form the complex and detecting the formation of the 
complex such that if a complex is formed, the polypeptide is detected. 

The invention also provides kits comprising polynucleotide probes and/or
monoclonal antibodies, and optionally quantitative standards, for carrying out methods of the
invention. Furthermore, the invention provides methods for evaluating the efficacy of drugs, 
and monitoring the progress of patients, involved in clinical trials for the treatment of 
disorders as recited above. 
The invention also provides methods for the identification of compounds that
modulate (i.e., increase or decrease) the expression or activity of the polynucleotides and/or
polypeptides of the invention. Such methods can be utilized, for example, for the
identification of compounds that can ameliorate symptoms of disorders as recited herein.
Such methods can include, but are not limited to, assays for identifying compounds and
other substances that interact with (e.g., bind to) the polypeptides of the invention. The
invention provides a method for identifying a compound that binds to the polypeptides of the
invention comprising contacting the compound with a polypeptide of the invention in a cell
for a time sufficient to form a polypeptide/compound complex, wherein the complex drives
expression of a reporter gene sequence in the cell; and detecting the complex by detecting
the reporter gene sequence expression such that if expression of the reporter gene is detected
the compound that binds to a polypeptide of the invention is identified.

The methods of the invention also provide methods for treatment which involve the 
administration of the polynucleotides or polypeptides of the invention to individuals
exhibiting symptoms or tendencies. In addition, the invention encompasses methods for
treating diseases or disorders as recited herein comprising administering compounds and 
other substances that modulate the overall activity of the target gene products. Compounds 
and other substances can affect such modulation either on the level of target gene/protein 
expression or target protein activity. 

The polypeptides of the present invention and the polynucleotides encoding them are
also useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Table 2); for which they have a
signature region (as set forth in Table 3); or for which they have homology to a gene family
(as set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides
and polynucleotides of the present invention are useful for a variety of applications, as
described herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION 

4.1 DEFINITIONS

It must be noted that as used herein and in the appended claims, the singular forms 
"a", "an" and "the" include plural references unless the context clearly dictates otherwise. 
The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of
the natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are
engaged in extracellular or intracellular membrane trafficking, including the export of
secretory or enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of 
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the 
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded
molecules may be "partial" such that only certain portion(s) of the nucleic acids bind or it
may be "complete" such that total complementarity exists between the single stranded
molecules. The degree of complementarity between the nucleic acid strands has significant
effects on the efficiency and strength of the hybridization between the nucleic acid strands. 
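
The base-pairing rule in this definition can be sketched in a few lines of Python. This is only an illustration: it assumes standard Watson-Crick DNA pairs (A with T, G with C), two strands of equal length aligned position by position, and the hypothetical helper names `complement_3to5` and `complementarity`.

```python
# Standard Watson-Crick DNA base pairs (assumption: DNA only, no ambiguity codes).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3to5(strand_5to3: str) -> str:
    """Return the complementary strand, written 3'->5' under the input strand."""
    return "".join(PAIRS[base] for base in strand_5to3.upper())


def complementarity(strand_5to3: str, partner_3to5: str) -> float:
    """Fraction of aligned positions that form a base pair.

    1.0 corresponds to "complete" complementarity; anything between 0 and 1
    corresponds to "partial" complementarity.
    """
    if len(strand_5to3) != len(partner_3to5) or not strand_5to3:
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        PAIRS.get(a) == b
        for a, b in zip(strand_5to3.upper(), partner_3to5.upper())
    )
    return paired / len(strand_5to3)


if __name__ == "__main__":
    print(complement_3to5("AGT"))          # partner of 5'-AGT-3', read 3'->5'
    print(complementarity("AGT", "TCA"))   # complete complementarity
    print(complementarity("AGT", "TCG"))   # partial complementarity
```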

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ
line stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a
steady and continuous source of germ cells for the production of gametes. The term
"primordial germ cells (PGCs)" refers to a small population of cells set aside from other cell
lineages, particularly from the yolk sac, mesenteries, or gonadal ridges during embryogenesis,
that have the potential to differentiate into germ cells and other cells. PGCs are the source
from which GSCs and ES cells are derived. The PGCs, the GSCs and the ES cells are
capable of self-renewal. Thus these cells not only populate the germ line and give rise to a
plurality of terminally differentiated cells that comprise the adult specialized organs, but are
able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides
which modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked 
sequence" when the expression of the sequence is altered by the presence of the EMF. 
EMFs include, but are not limited to, promoters and promoter modulating sequences
(inducible elements). One class of EMFs is nucleic acid fragments which induce the
expression of an operably linked ORF in response to a specific regulatory factor or 
physiological event. 

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or
the sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or
synthetic origin which may be single-stranded or double-stranded and may represent the
sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like
material. In the sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and
N is A, C, G, or T (U) or unknown. It is contemplated that where the polynucleotide is
RNA, the T (thymine) in the sequences provided herein is substituted with U (uracil).
Generally, nucleic acid segments provided by this invention may be assembled from
fragments of the genome and short oligonucleotide linkers, or from a series of
oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic acid which is
capable of being expressed in a recombinant transcriptional unit comprising regulatory
elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of
nucleotide residues which are at least about 5 nucleotides, more preferably at least about 7
nucleotides, more preferably at least about 9 nucleotides, more preferably at least about 11
nucleotides and most preferably at least about 17 nucleotides. The fragment is preferably
less than about 500 nucleotides, preferably less than about 200 nucleotides, more preferably
less than about 100 nucleotides, more preferably less than about 50 nucleotides and most
preferably less than 30 nucleotides. Preferably the probe is from about 6 nucleotides to
about 200 nucleotides, preferably from about 15 to about 50 nucleotides, more preferably
from about 17 to 30 nucleotides and most preferably from about 20 to 25 nucleotides.
Preferably the fragments can be used in polymerase chain reaction (PCR), various
hybridization procedures or microarray procedures to identify or amplify identical or related
parts of mRNA or DNA molecules. A fragment or segment may uniquely identify each
polynucleotide sequence of the present invention. Preferably the fragment comprises a
sequence substantially similar to any one of SEQ ID NO: 1-1041, or 2083-2534.

Probes may, for example, be used to determine whether specific mRNA molecules 
are present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal 
DNA as described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250).
They may be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods
well known in the art. Probes of the present invention, their preparation and/or labeling are
elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in
Molecular Biology, John Wiley & Sons, New York NY, both of which are incorporated 
herein by reference in their entirety. 

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534. The
sequence information can be a segment of any one of SEQ ID NO: 1-1041, or 2083-2534
that uniquely identifies or represents the sequence information of that sequence of SEQ ID
NO: 1-1041, or 2083-2534, or those segments identified in Tables 3, 5, 6, and 8. One such
segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is approximately 1 in 300. In the human
genome, there are three billion base pairs in one set of chromosomes. Because 4^20 possible
twenty-mers exist, there are approximately 300 times more twenty-mers than there are base
pairs in a set of human chromosomes.
Using the same analysis, the probability for a seventeen-mer to be fully matched in the
human genome is approximately 1 in 5. When these segments are used in arrays for
expression studies, fifteen-mer segments can be used. The probability that the fifteen-mer is
fully matched in the expressed sequences is also approximately one in five because
expressed sequences comprise less than approximately 5% of the entire genome sequence.
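
The arithmetic behind these full-match estimates can be reproduced with a short Python sketch. It assumes a uniform random-sequence model (each of the 4^k possible k-mers equally likely), which is the model implied by the counting argument above; the names `GENOME_BP`, `EXPRESSED_FRACTION` and `expected_full_matches` are hypothetical. Under this model the exact ratios come out near 1 in 370 for a twenty-mer, 1 in 6 for a seventeen-mer and 1 in 7 for a fifteen-mer against expressed sequence, which agrees with the rounded figures in the text to within the precision of an order-of-magnitude estimate.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # upper bound on expressed sequence (from the text)


def expected_full_matches(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected number of chance exact matches of one k-mer in `target_bp` bases.

    Assumption: every one of the 4**k possible k-mers is equally likely at
    each position (uniform random-sequence model).
    """
    return target_bp / 4 ** k


if __name__ == "__main__":
    expressed_bp = GENOME_BP * EXPRESSED_FRACTION
    for k, target, label in [
        (20, GENOME_BP, "genome"),
        (17, GENOME_BP, "genome"),
        (15, expressed_bp, "expressed sequence"),
    ]:
        odds = 1 / expected_full_matches(k, target)
        print(f"{k}-mer vs {label}: about 1 in {odds:.1f}")
```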

Similarly, when using sequence information for detecting a single mismatch, a segment
can be a twenty-five mer. The probability that the twenty-five mer would appear in a human
genome with a single mismatch is calculated by multiplying the probability for a full match
(1 ÷ 4^25) times the increased probability for mismatch at each nucleotide position (3 × 25). The
probability that an eighteen mer with a single mismatch can be detected in an array for
expression studies is approximately one in five. The probability that a twenty-mer with a single
mismatch can be detected in a human genome is approximately one in five.
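
The single-mismatch calculation described above, (1 ÷ 4^k) × (3 × k), can be sketched the same way. The sketch again assumes a uniform random-sequence model, a three-billion-base-pair genome and expressed sequence amounting to 5% of it, as stated in the text; the function names are hypothetical. Under these assumptions the exact ratios are roughly 1 in 6 for a twenty-mer against the genome and roughly 1 in 8 for an eighteen-mer against expressed sequence, the same order of magnitude as the approximate figures given in the text.

```python
GENOME_BP = 3_000_000_000            # base pairs in one set of chromosomes (from the text)
EXPRESSED_BP = GENOME_BP * 0.05      # expressed sequence, taken as 5% of the genome


def single_mismatch_probability(k: int) -> float:
    """Probability that a random k-mer matches a given k-mer with exactly one mismatch.

    Full-match probability (1 / 4**k) times the number of single-base
    variants (3 alternative bases at each of k positions).
    """
    return (3 * k) / 4 ** k


def expected_single_mismatch_hits(k: int, target_bp: float) -> float:
    """Expected number of chance single-mismatch occurrences in `target_bp` bases."""
    return target_bp * single_mismatch_probability(k)


if __name__ == "__main__":
    print(single_mismatch_probability(25))
    print(1 / expected_single_mismatch_hits(20, GENOME_BP))
    print(1 / expected_single_mismatch_hits(18, EXPRESSED_BP))
```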

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
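
The ORF definition lends itself to a simple check. The following Python sketch assumes the standard genetic code, in which TAA, TAG and TGA are the termination codons, and reads triplets in a single caller-specified frame; `is_open_reading_frame` and `open_stretches` are hypothetical helper names used only for illustration.

```python
# Termination codons of the standard genetic code (assumption: standard code).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def _normalize(seq: str) -> str:
    """Upper-case the sequence and read RNA (U) as DNA (T)."""
    return seq.upper().replace("U", "T")


def is_open_reading_frame(seq: str) -> bool:
    """True if `seq` is a whole number of triplets with no termination codon."""
    seq = _normalize(seq)
    if not seq or len(seq) % 3 != 0:
        return False
    return all(seq[i:i + 3] not in STOP_CODONS for i in range(0, len(seq), 3))


def open_stretches(seq: str, frame: int = 0) -> list:
    """Split one reading frame into maximal stop-free stretches of triplets."""
    seq = _normalize(seq)
    stretches, current = [], []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            if current:
                stretches.append("".join(current))
            current = []
        else:
            current.append(codon)
    if current:
        stretches.append("".join(current))
    return stretches


if __name__ == "__main__":
    print(is_open_reading_frame("ATGGCCAAA"))
    print(open_stretches("ATGAAATAGCCC"))
```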

The terms "operably linked" or "operably associated" refer to functionally related 
nucleic acid sequences. For example, a promoter is operably associated or operably linked 
with a coding sequence if the promoter controls the transcription of the coding sequence. 
While operably linked nucleic acid sequences can be contiguous and in the same reading
frame, certain genetic elements, e.g., repressor genes, are not contiguously linked to the coding
sequence but still control transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number 
of differentiated cell types that are present in an adult organism. A pluripotent cell is
restricted in its differentiation capability in comparison to a totipotent cell. 

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an
oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and to naturally
occurring or synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a
stretch of amino acid residues of at least about 5 amino acids, preferably at least about 7
amino acids, more preferably at least about 9 amino acids and most preferably at least about
17 or more amino acids. The peptide preferably is not greater than about 200 amino acids,
more preferably less than 150 amino acids and most preferably less than 100 amino acids.
Preferably the peptide is from about 5 to about 200 amino acids. To be active, any
polypeptide must have sufficient length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells 
that have not been genetically engineered and specifically contemplates various polypeptides 
arising from post-translational modifications of the polypeptide including, but not limited to, 
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the
full-length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a 
peptide or protein without a signal or leader sequence. The "mature protein portion" means 
that portion of the protein which does not include a signal or leader sequence. The peptide 
may have been produced by processing in the cell which removes any leader/signal
sequence. The mature protein portion may or may not include the initial methionine residue.
The methionine residue may be removed from the protein during processing in the cell. The
peptide may be produced synthetically or the protein may have been produced using a
polynucleotide encoding only the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques
as ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally
occur in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using,
e.g., recombinant DNA techniques. Guidance in determining which amino acid residues
may be replaced, added or deleted without abolishing activities of interest, may be found by 
comparing the sequence of the particular polypeptide with that of homologous peptides and 
minimizing the number of amino acid sequence changes made in regions of high homology 
(conserved regions) or by replacing amino acids with consensus sequence. 

Alternatively, recombinant variants encoding these same or similar polypeptides may
be synthesized or selected by making use of the "redundancy" in the genetic code. Various
codon substitutions, such as the silent changes which produce various restriction sites, may
be introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be
reflected in the polypeptide or domains of other peptides added to the polypeptide to modify
the properties of any part of the polypeptide, to change characteristics such as ligand-binding 
affinities, interchain affinities, or degradation/turnover rate. 

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative
amino acid replacements. "Conservative" amino acid substitutions may be made on the
basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the
amphipathic nature of the residues involved. For example, nonpolar (hydrophobic) amino
acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and
methionine; polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine,
asparagine, and glutamine; positively charged (basic) amino acids include arginine, lysine,
and histidine; and negatively charged (acidic) amino acids include aspartic acid and glutamic
acid. "Insertions" or "deletions" are preferably in the range of about 1 to 20 amino acids,
more preferably 1 to 10 amino acids. The variation allowed may be experimentally
determined by systematically making insertions, deletions, or substitutions of amino acids in
a polypeptide molecule using recombinant DNA techniques and assaying the resulting
recombinant variants for activity.
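
By way of illustration only, the residue groupings listed above can be expressed as a small lookup. The following Python sketch is not part of the disclosed methods; the names RESIDUE_CLASSES, residue_class and is_conservative are hypothetical, and the sketch assumes that a substitution is treated as "conservative" when both residues fall in the same listed group.

```python
# Residue groups exactly as listed in the text, using one-letter codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),       # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),   # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),               # Arg, Lys, His
    "acidic": set("DE"),               # Asp, Glu
}


def residue_class(aa):
    """Return the name of the listed group that contains the residue."""
    aa = aa.upper()
    for name, members in RESIDUE_CLASSES.items():
        if aa in members:
            return name
    raise ValueError(f"unknown residue: {aa!r}")


def is_conservative(original, replacement):
    """Assumption: 'conservative' means both residues share a listed group."""
    return residue_class(original) == residue_class(replacement)
```

Under this assumption, replacing leucine with isoleucine would be reported as conservative, whereas replacing leucine with aspartic acid would not. Whether a given replacement actually preserves activity remains a matter for the experimental assay described above.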

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such
alterations can, for example, alter one or more of the biological functions or biochemical
characteristics of the polypeptides of the invention. For example, such alterations may
change polypeptide characteristics such as ligand-binding affinities, interchain affinities, or
degradation/turnover rate. Further, such alterations can be selected so as to generate
polypeptides that are better suited for expression, scale up and the like in the host cells
chosen for expression. For example, cysteine residues can be deleted or substituted with
another amino acid residue in order to eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the
indicated nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight,
more preferably at least 99% by weight, of the indicated biological macromolecules present
(but water, buffers, and other small molecules, especially molecules having a molecular
weight of less than 1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated
from at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic
acid or polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide
is found in the presence of (if anything) only a solvent, buffer, ion, or other component
normally present in a solution of the same. The terms "isolated" and "purified" do not
encompass nucleic acids or polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or
mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins
made in bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant
microbial" defines a polypeptide or protein essentially free of native endogenous substances
and unaccompanied by associated native glycosylation. Polypeptides or proteins expressed
in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications;
polypeptides or proteins expressed in yeast will have a glycosylation pattern in general
different from those expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or
virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression
vehicle can comprise a transcriptional unit comprising an assembly of (1) a genetic element
or elements having a regulatory role in gene expression, for example, promoters or
enhancers, (2) a structural or coding sequence which is transcribed into mRNA and
translated into protein, and (3) appropriate transcription initiation and termination sequences.
Structural units intended for use in yeast or eukaryotic expression systems preferably include
a leader sequence enabling extracellular secretion of translated protein by a host cell.
Alternatively, where recombinant protein is expressed without a leader or transport
sequence, it may include an amino terminal methionine residue. This residue may or may
not be subsequently cleaved from the expressed recombinant protein to provide a final
product.

The term "recombinant expression system" means host cells which have stably
integrated a recombinant transcriptional unit into chromosomal DNA or carry the
recombinant transcriptional unit extrachromosomally. Recombinant expression systems as
defined herein will express heterologous polypeptides or proteins upon induction of the
regulatory elements linked to the DNA segment or synthetic gene to be expressed. This term
also means host cells which have stably integrated a recombinant genetic element or
elements having a regulatory role in gene expression, for example, promoters or enhancers.
Recombinant expression systems as defined herein will express polypeptides or proteins
endogenous to the cell upon induction of the regulatory elements linked to the endogenous
DNA segment or gene to be expressed. The cells can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a
membrane, including transport as a result of signal sequences in its amino acid sequence
when it is expressed in a suitable host cell. "Secreted" proteins include without limitation
proteins secreted wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell in
which they are expressed. "Secreted" proteins also include without limitation proteins that
are transported across the membrane of the endoplasmic reticulum. "Secreted" proteins are
also intended to include proteins containing non-typical signal sequences (e.g., Interleukin-1
Beta, see Krasney, P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) and factors
released from damaged cells (e.g., Interleukin-1 Receptor Antagonist, see Arend, W.P. et al.
(1998) Annu. Rev. Immunol. 16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader
sequence" which will direct the polypeptide through the membrane of a cell. Such a
sequence may be naturally present on the polypeptides of the present invention or provided
from heterologous protein sources by recombinant DNA techniques.

The term "stringent" is used to refer to conditions that are commonly understood in
the art as stringent. Stringent conditions can include highly stringent conditions (i.e.,
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1
mM EDTA at 65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent
conditions (i.e., washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization
conditions are described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary
stringent hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate
at 37°C (for 14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for
20-base oligonucleotides), and 60°C (for 23-base oligonucleotides).

As used herein, "substantially equivalent" or "substantially similar" can refer both to
nucleotide and amino acid sequences, for example a mutant sequence, that varies from a
reference sequence by one or more substitutions, deletions, or additions, the net effect of
which does not result in an adverse functional dissimilarity between the reference and
subject sequences. Typically, such a substantially equivalent sequence varies from one of
those listed herein by no more than about 35% (i.e., the number of individual residue
substitutions, additions, and/or deletions in a substantially equivalent sequence, as compared
to the corresponding reference sequence, divided by the total number of residues in the
substantially equivalent sequence is about 0.35 or less). Such a sequence is said to have
65% sequence identity to the listed sequence. In one embodiment, a substantially
equivalent, e.g., mutant, sequence of the invention varies from a listed sequence by no more
than 30% (70% sequence identity); in a variation of this embodiment, by no more than 25%
(75% sequence identity); and in a further variation of this embodiment, by no more than
20% (80% sequence identity) and in a further variation of this embodiment, by no more than
10% (90% sequence identity) and in a further variation of this embodiment, by no more than
5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid sequences
according to the invention preferably have at least 80% sequence identity with a listed amino
acid sequence, more preferably at least 85% sequence identity, more preferably at least 90%
sequence identity, more preferably at least 95% sequence identity, more preferably at least
98% sequence identity, and most preferably at least 99% sequence identity. Substantially
equivalent nucleotide sequences of the invention can have lower percent sequence identities,
taking into account, for example, the redundancy or degeneracy of the genetic code.
Preferably, the nucleotide sequence has at least about 65% identity, more preferably at least
about 75% identity, more preferably at least about 80% sequence identity, more preferably at
least 85% sequence identity, more preferably at least 90% sequence identity, more preferably
at least about 95% sequence identity, more preferably at least 98% sequence identity, and
most preferably at least 99% sequence identity. For the purposes of the present invention,
sequences having substantially equivalent biological activity and substantially equivalent
expression characteristics are considered substantially equivalent. For the purposes of
determining equivalence, truncation of the mature sequence (e.g., via a mutation which
creates a new stop codon) should be disregarded. Sequence identity may be determined,
e.g., using the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645).
Identity between sequences can also be determined by other methods known in the art, e.g.,
by varying hybridization conditions.
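
By way of illustration only, the ratio described above (the number of residue substitutions, additions, and/or deletions divided by the total number of residues in the substantially equivalent sequence) can be sketched in Python. The sketch assumes that the two sequences have already been aligned to equal length, with "-" marking gaps; it does not perform the alignment itself and is not the Jotun Hein method. The function names and the gap convention are hypothetical.

```python
def variation_fraction(aligned_ref, aligned_subj):
    """Differences divided by the residue count of the subject sequence.

    Assumption: both inputs are pre-aligned strings of equal length in
    which '-' marks a gap (an addition or deletion relative to the other).
    """
    if len(aligned_ref) != len(aligned_subj):
        raise ValueError("aligned sequences must have equal length")
    # Each mismatching column counts as one substitution, addition or deletion.
    differences = sum(1 for r, s in zip(aligned_ref, aligned_subj) if r != s)
    # Denominator: residues actually present in the subject sequence.
    subject_length = sum(1 for s in aligned_subj if s != "-")
    if subject_length == 0:
        raise ValueError("subject sequence contains no residues")
    return differences / subject_length


def is_substantially_equivalent(aligned_ref, aligned_subj, max_variation=0.35):
    """Apply the 'no more than about 35%' threshold from the text."""
    return variation_fraction(aligned_ref, aligned_subj) <= max_variation
```

For instance, a ten-residue subject that differs from its reference at a single aligned position gives a ratio of 0.1, which under this sketch falls within the 35% limit. The threshold parameter can be lowered to reflect the narrower embodiments (30%, 25%, 20%, 10%, 5%).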

The term "totipotent" refers to the capability of a cell to differentiate into all of the
cell types of an adult organism.

The term "transformation" means introducing DNA into a suitable host cell so that
the DNA is replicable, either as an extrachromosomal element, or by chromosomal
integration. The term "transfection" refers to the taking up of an expression vector by a
suitable host cell, whether or not any coding sequences are in fact expressed. The term
"infection" refers to the introduction of nucleic acids into a suitable host cell by use of a
virus or viral vector.

As used herein, an "uptake modulating fragment," UMF, means a series of
nucleotides which mediate the uptake of a linked DNA fragment into a cell. UMFs can be
readily identified using known UMFs as a target sequence or target motif with the
computer-based systems described below. The presence and activity of a UMF can be
confirmed by attaching the suspected UMF to a marker sequence. The resulting nucleic acid
molecule is then incubated with an appropriate host under appropriate conditions and the
uptake of the marker sequence is determined. As described above, a UMF will increase the
frequency of uptake of a linked marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless 
the context dictates otherwise. 






4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing.

The isolated polynucleotides of the invention include a polynucleotide comprising
the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534; a polynucleotide encoding
any one of the peptide sequences of SEQ ID NO: 1-1041, or 2083-2534; and a
polynucleotide comprising the nucleotide sequence encoding the mature protein coding
sequence of the polynucleotides of any one of SEQ ID NO: 1-1041, or 2083-2534. The
polynucleotides of the present invention also include, but are not limited to, a polynucleotide
that hybridizes under stringent conditions to (a) the complement of any of the nucleotide
sequences of SEQ ID NO: 1-1041, or 2083-2534; (b) nucleotide sequences encoding any one
of the amino acid sequences set forth in the Sequence Listing, or Table 8; (c) a
polynucleotide which is an allelic variant of any polynucleotide recited above; (d) a
polynucleotide which encodes a species homolog of any of the proteins recited above; or (e)
a polynucleotide that encodes a polypeptide comprising a specific domain or truncation of
the polypeptides of SEQ ID NO: 1042-2082, or 2535-2986 (for example, as set forth in
Tables 3, 5, 6, or 8). Domains of interest may depend on the nature of the encoded
polypeptide; e.g., domains in receptor-like polypeptides include ligand-binding,
extracellular, transmembrane, or cytoplasmic domains, or combinations thereof; domains in
immunoglobulin-like proteins include the variable immunoglobulin-like domains; domains
in enzyme-like polypeptides include catalytic and substrate binding domains; and domains in
ligand polypeptides include receptor-binding domains.

The polynucleotides of the invention include naturally occurring or wholly or
partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The
polynucleotides may include the entire coding region of the cDNA or may represent a portion of
the coding region of the cDNA.

The present invention also provides genes corresponding to the cDNA sequences
disclosed herein. The corresponding genes can be isolated in accordance with known methods
using the sequence information disclosed herein. Such methods include the preparation of
probes or primers from the disclosed sequence information for identification and/or
amplification of genes in appropriate genomic libraries or other sources of genomic materials.
Further 5' and 3' sequence can be obtained using methods known in the art. For example, full
length cDNA or genomic DNA that corresponds to any of the polynucleotides of SEQ ID NO:
1-1041, or 2083-2534 can be obtained by screening appropriate cDNA or genomic DNA
libraries under suitable hybridization conditions using any of the polynucleotides of SEQ ID
NO: 1-1041, or 2083-2534 or a portion thereof as a probe. Alternatively, the polynucleotides of
SEQ ID NO: 1-1041, or 2083-2534 may be used as the basis for suitable primer(s) that allow
identification and/or amplification of genes in appropriate genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence
information, representative fragment or segment information, or novel segment information for
the full-length gene.

The polynucleotides of the invention also provide polynucleotides including
nucleotide sequences that are substantially equivalent to the polynucleotides recited above.
Polynucleotides according to the invention can have, e.g., at least about 65%, at least about
70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least
about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%, 91%, 92%, 93%, 94%,
and even more typically at least about 95%, 96%, 97%, 98%, 99% sequence identity to a
polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic
acid sequence fragments that hybridize under stringent conditions to any of the nucleotide
sequences of SEQ ID NO: 1-1041, or 2083-2534, or complements thereof, which fragment is
greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9
nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20
nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the
polynucleotides of the invention are contemplated. Probes capable of specifically
hybridizing to a polynucleotide can differentiate polynucleotide sequences of the invention
from other polynucleotide sequences in the same family of genes or can differentiate human
genes from genes of other species, and are preferably based on unique nucleotide sequences.
The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:
1-1041, or 2083-2534, a representative fragment thereof, or a nucleotide sequence at least 90%
identical, preferably 95% identical, to SEQ ID NO: 1-1041, or 2083-2534 with a sequence from
another isolate of the same species. Furthermore, to accommodate codon variability, the
invention includes nucleic acid molecules coding for the same amino acid sequences as do the
specific ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of
one codon for another codon that encodes the same amino acid is expressly contemplated.
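
By way of illustration only, codon substitution that leaves the encoded amino acid sequence unchanged can be checked with a short Python sketch. The sketch assumes the standard genetic code and DNA coding-strand input in the correct reading frame; the names CODON_TABLE, translate and encode_same_protein are hypothetical and are not taken from the disclosure.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a, orf_b):
    """True when two ORFs differ only by synonymous codon substitutions."""
    return translate(orf_a) == translate(orf_b)
```

For example, ATGCTGTCT and ATGTTAAGC both translate to Met-Leu-Ser under the standard code, so the second sequence would be regarded here as coding for the same amino acid sequence as the first.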




The nearest neighbor or homology results for the nucleic acids of the present invention,
including SEQ ID NO: 1-1041, or 2083-2534, can be obtained by searching a database using an
algorithm or a program. Preferably, a BLAST (Basic Local Alignment Search Tool) program is
used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the FASTXY algorithm, may be performed.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are
also provided by the present invention. Species homologs may be isolated and identified by
making suitable probes or primers from the sequences provided herein and screening a
suitable nucleic acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which 
also encode proteins which are identical, homologous or related to that encoded by the 
polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic
acids encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These
nucleic acid alterations can be made at sites that differ in the nucleic acids from different
species (variable positions) or in highly conserved regions (constant regions). Sites at such
locations will typically be modified in series, e.g., by substituting first with conservative
choices (e.g., hydrophobic amino acid to a different hydrophobic amino acid) and then with
more distant choices (e.g., hydrophobic amino acid to a charged amino acid), and then
deletions or insertions may be made at the target site. Amino acid sequence deletions
generally range from about 1 to 30 residues, preferably about 1 to 10 residues, and are
typically contiguous. Amino acid insertions include amino- and/or carboxyl-terminal
fusions ranging in length from one to one hundred or more residues, as well as intrasequence
insertions of single or multiple amino acid residues. Intrasequence insertions may range
generally from about 1 to 10 amino residues, preferably from 1 to 5 residues. Examples of
terminal insertions include the heterologous signal sequences necessary for secretion or for
intracellular targeting in different host cells and sequences such as FLAG or poly-histidine
sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter
a polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of
the site being changed. In general, the techniques of site-directed mutagenesis are well
known to those of skill in the art and this technique is exemplified by publications such as
Edelman et al., DNA 2:183 (1983). A versatile and efficient method for producing
site-specific changes in a polynucleotide sequence was published by Zoller and Smith,
Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to create amino acid
sequence variants of the novel nucleic acids. When small amounts of template DNA are
used as starting material, primer(s) that differ slightly in sequence from the corresponding
region in the template DNA can generate the desired amino acid variant. PCR amplification
results in a population of product DNA fragments that differ from the polynucleotide
template encoding the polypeptide at the position specified by the primer. The product DNA
fragments replace the corresponding region in the plasmid and this gives a polynucleotide
encoding the desired amino acid variant.
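
By way of illustration only, the layout of a mutagenic oligonucleotide (the altered codon with adjacent template nucleotides on both sides) can be sketched in Python. The sketch is not the procedure of the cited publications; the function name mutagenic_oligo, the zero-based codon index, and the default flank length of 12 nucleotides are assumptions chosen for the example, and the flank length needed for a stable duplex in practice would have to be determined separately.

```python
def mutagenic_oligo(template, codon_index, new_codon, flank=12):
    """Return the new codon flanked by unchanged template nucleotides.

    template    -- coding-strand DNA, in frame from its first base
    codon_index -- zero-based index of the codon to change (assumption)
    new_codon   -- three-base replacement codon
    flank       -- template bases kept on each side (hypothetical default)
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("replacement codon must be three bases long")
    start = codon_index * 3
    end = start + 3
    # Both flanks must lie entirely within the template.
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("template too short for the requested flanks")
    left = template[start - flank:start]
    right = template[end:end + flank]
    return left + new_codon + right
```

For example, with the template AAACCCGGGTTTAAACCCGGG, changing the fourth codon (index 3, TTT) to TGG with six flanking bases on each side gives CCCGGGTGGAAACCC. A primer for use on the opposite strand would be the reverse complement of this sequence.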

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques
well known in the art, such as, for example, the techniques in Sambrook et al., supra, and
Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of
the genetic code, other DNA sequences which encode substantially the same or a
functionally equivalent amino acid sequence may be used in the practice of the invention for
the cloning and expression of these novel nucleic acids. Such DNA sequences include those
which are capable of hybridizing to the appropriate novel nucleic acid sequence under
stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention could be
used to generate polynucleotides encoding chimeric or fusion proteins comprising one or
more domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of
the polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA,
amplified, or synthetic) or RNA. Methods and algorithms for obtaining such
polynucleotides are well known to those of skill in the art and can include, for example,
methods for determining hybridization conditions that can routinely isolate polynucleotides
of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature
protein coding sequences corresponding to any one of SEQ ID NO: 1-1041, or 2083-2534,
or functional equivalents thereof, may be used to generate recombinant DNA molecules that
direct the expression of that nucleic acid, or a functional equivalent thereof, in appropriate
host cells. Also included are the cDNA inserts of any of the clones identified herein.

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et
al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY).
Useful nucleotide sequences for joining to polynucleotides include an assortment of vectors,
e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well
known in the art. Accordingly, the invention also provides a vector including a
polynucleotide of the invention and a host cell containing the polynucleotide. In general, the
vector contains an origin of replication functional in at least one organism, convenient
restriction endonuclease sites, and a selectable marker for the host cell. Vectors according to
the invention include expression vectors, replication vectors, probe generation vectors, and
sequencing vectors. A host cell according to the invention can be a prokaryotic or
eukaryotic cell and can be a unicellular organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic
acid having any of the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534 or a
fragment thereof or any other polynucleotides of the invention. In one embodiment, the
recombinant constructs of the present invention comprise a vector, such as a plasmid or viral
vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO:
1-1041, or 2083-2534 or a fragment thereof is inserted, in a forward or reverse orientation. In
the case of a vector comprising one of the ORFs of the present invention, the vector may
further comprise regulatory sequences, including for example, a promoter, operably linked to
the ORF. Large numbers of suitable vectors and promoters are known to those of skill in the
art and are commercially available for generating the recombinant constructs of the present
invention. The following vectors are provided by way of example: Bacterial: pBs,
phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a
(Stratagene), pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); Eukaryotic:
pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene), pSVK3, pBPV, pMSG, pSVL
(Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly.
Many suitable expression control sequences are known in the art. General methods of
expressing recombinant proteins are also known and are exemplified in R. Kaufman,
Methods in Enzymology 185, 537-566 (1990). As defined herein "operably linked" means
that the isolated polynucleotide of the invention and an expression control sequence are
situated within a vector or cell in such a way that the protein is expressed by a host cell
which has been transformed (transfected) with the ligated polynucleotide/expression control
sequence.

Promoter regions can be selected from any desired gene using CAT
(chloramphenicol transferase) vectors or other vectors with selectable markers. Two
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include
lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate
early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse
metallothionein-I. Selection of the appropriate vector and promoter is well within the level
of ordinary skill in the art. Generally, recombinant expression vectors will include origins of
replication and selectable markers permitting transformation of the host cell, e.g., the
ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived
from a highly expressed gene to direct transcription of a downstream structural sequence.
Such promoters can be derived from operons encoding glycolytic enzymes such as
3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among
others. The heterologous structural sequence is assembled in appropriate phase with
translation initiation and termination sequences, and preferably, a leader sequence capable of
directing secretion of translated protein into the periplasmic space or extracellular medium.
Optionally, the heterologous sequence can encode a fusion protein including an amino
terminal identification peptide imparting desired characteristics, e.g., stabilization or
simplified purification of expressed recombinant product. Useful expression vectors for
bacterial use are constructed by inserting a structural DNA sequence encoding a desired
protein together with suitable translation initiation and termination signals in operable
reading phase with a functional promoter. The vector will comprise one or more phenotypic
selectable markers and an origin of replication to ensure maintenance of the vector and to, if
desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may
also be employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial
use can comprise a selectable marker and bacterial origin of replication derived from
commercially available plasmids comprising genetic elements of the well known cloning
vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3
(Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI,
USA). These pBR322 "backbone" sections are combined with an appropriate promoter and
the structural sequence to be expressed. Following transformation of a suitable host strain
and growth of the host strain to an appropriate cell density, the selected promoter is induced
or derepressed by appropriate means (e.g., temperature shift or chemical induction) and cells
are cultured for an additional period. Cells are typically harvested by centrifugation,
disrupted by physical or chemical means, and the resulting crude extract retained for further
purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17, 870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intra-muscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form
of naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules 
that are hybridizable to or complementary to the nucleic acid molecule comprising the 
nucleotide sequence of SEQ ID NO: 1-1041, or 2083-2534, or fragments, analogs or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the
coding strand of a double-stranded cDNA molecule or complementary to an mRNA
sequence. In specific aspects, antisense nucleic acid molecules are provided that comprise a
sequence complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an
entire coding strand, or to only a portion thereof. Nucleic acid molecules encoding
fragments, homologs, derivatives and analogs of a protein of any of SEQ ID NO: 1-1041, or
2083-2534 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID
NO: 1-1041, or 2083-2534 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding 
region" of the coding strand of a nucleotide sequence of the invention. The term "coding 
region" refers to the region of the nucleotide sequence comprising codons which are 
translated into amino acid residues. In another embodiment, the antisense nucleic acid
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence
of the invention. The term "noncoding region" refers to 5' and 3' sequences that flank the
coding region and that are not translated into amino acids (also referred to as 5' and 3'
untranslated regions).
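By way of illustration only, the distinction between the coding region and the flanking noncoding regions can be sketched in Python. The function name split_regions and the example sequence are hypothetical and are not taken from the sequence listing. The sketch assumes, as a simplification, that the coding region begins at the first ATG of the coding strand and ends at the first in-frame stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_regions(cdna: str):
    """Split a coding-strand sequence into (5' UTR, coding region, 3' UTR).

    Simplifying assumption: the coding region starts at the first ATG and
    ends at the first in-frame stop codon (the stop codon is included).
    """
    seq = cdna.upper()
    start = seq.find("ATG")
    if start == -1:
        raise ValueError("no start codon found")
    # Walk codon by codon from the start codon until a stop codon is met.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            end = i + 3
            return seq[:start], seq[start:end], seq[end:]
    raise ValueError("no in-frame stop codon found")


# Hypothetical example sequence, for illustration only.
five_utr, coding, three_utr = split_regions("GGCACCATGGCTTGCAAATGAGGTTTAAA")
```

An antisense molecule directed to the "coding region" would be complementary to the middle element returned above, whereas one directed to a "noncoding region" would be complementary to the first or last element.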

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g.,
SEQ ID NO: 1-1041, or 2083-2534), antisense nucleic acids of the invention can be designed
according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic
acid molecule can be complementary to the entire coding region of an mRNA, but more
preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding
region of an mRNA. For example, the antisense oligonucleotide can be complementary to
the region surrounding the translation start site of an mRNA. An antisense oligonucleotide
can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An
antisense nucleic acid of the invention can be constructed using chemical synthesis or
enzymatic ligation reactions using procedures known in the art. For example, an antisense
nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using
naturally occurring nucleotides or variously modified nucleotides designed to increase the
biological stability of the molecules or to increase the physical stability of the duplex formed
between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and
acridine substituted nucleotides can be used.
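As a non-limiting sketch of design by Watson and Crick base pairing, the following Python fragment derives an antisense oligonucleotide complementary to the region surrounding the translation start site. The function name antisense_oligo, the flank parameter and the example sequence are hypothetical. The sketch assumes that the first AUG in the supplied mRNA is the start codon, and it does not model Hoogsteen pairing or modified nucleotides.

```python
# Watson-Crick pairing for RNA: A pairs with U, C pairs with G.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_oligo(mrna: str, flank: int = 8) -> str:
    """Return an antisense RNA oligonucleotide, written 5' to 3'.

    The target window is the first AUG plus up to `flank` nucleotides on
    each side. Assumption: the first AUG is the translation start site.
    """
    seq = mrna.upper().replace("T", "U")  # accept cDNA-style input as well
    start = seq.find("AUG")
    if start == -1:
        raise ValueError("no AUG start codon found")
    left = max(0, start - flank)
    right = min(len(seq), start + 3 + flank)
    target = seq[left:right]
    # Complement every base, then reverse so the result reads 5' to 3'.
    return target.translate(COMPLEMENT)[::-1]


# Hypothetical example sequence, for illustration only.
oligo = antisense_oligo("GGCACCAUGGCUUGC", flank=3)
```

With the default flank of 8 the oligonucleotide is at most 19 nucleotides long, which falls within the exemplary lengths recited above.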

Examples of modified nucleotides that can be used to generate the antisense nucleic
acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine,
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine,
1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine,
3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine,
5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine,
uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine,
5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester,
uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil,
(acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced
biologically using an expression vector into which a nucleic acid has been subcloned in an
antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an
antisense orientation to a target nucleic acid of interest, described further in the following
subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of
the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the
case of an antisense nucleic acid molecule that binds to DNA duplexes, through specific
interactions in the major groove of the double helix. An example of a route of
administration of antisense nucleic acid molecules of the invention includes direct injection
at a tissue site. Alternatively, antisense nucleic acid molecules can be modified to target
selected cells and then administered systemically. For example, for systemic administration,
antisense molecules can be modified such that they specifically bind to receptors or antigens
expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to
peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic
acid molecules can also be delivered to cells using the vectors described herein. To achieve
sufficient intracellular concentrations of antisense molecules, vector constructs in which the
antisense nucleic acid molecule is placed under the control of a strong pol II or pol III
promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units,
the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15:
6625-6641). The antisense nucleic acid molecule can also comprise a
2'-o-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. 
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of 
cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a 
complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in
Haselhoff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave
mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having specificity
for a nucleic acid of the invention can be designed based upon the nucleotide sequence of a
DNA disclosed herein (i.e., SEQ ID NO: 1-1041, or 2083-2534). For example, a derivative
of Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the
active site is complementary to the nucleotide sequence to be cleaved in an mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively,
mRNA of the invention can be used to select a catalytic RNA having a specific ribonuclease
activity from a pool of RNA molecules. See, e.g., Bartel et al. (1993) Science
261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple
helical structures that prevent transcription of the gene in target cells. See generally, Helene
(1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci.
660:27-36; and Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the
base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability,
hybridization, or solubility of the molecule. For example, the deoxyribose phosphate
backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup
et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms "peptide nucleic acids"
or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose
phosphate backbone is replaced by a pseudopeptide backbone and only the four natural
nucleobases are retained. The neutral backbone of PNAs has been shown to allow for
specific hybridization to DNA and RNA under conditions of low ionic strength. The
synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis
protocols as described in Hyrup et al. (1996) above; Perry-O'Keefe et al. (1996) PNAS 93:
14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation
of gene expression by, e.g., inducing transcription or translation arrest or inhibiting
replication. PNAs of the invention can also be used, e.g., in the analysis of single base pair
mutations in a gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes
when used in combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above);
or as probes or primers for DNA sequence and hybridization (Hyrup et al. (1996), above;
Perry-O'Keefe (1996), above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance
their stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by
the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA
portion while the PNA portion would provide high binding affinity and specificity.
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms of
base stacking, number of bonds between the nucleobases, and orientation (Hyrup (1996)
above). The synthesis of PNA-DNA chimeras can be performed as described in Hyrup
(1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a DNA chain
can be synthesized on a solid support using standard phosphoramidite coupling chemistry,
and modified nucleoside analogs, e.g., 5'-(4-methoxytrityl)amino-5'-deoxy-thymidine
phosphoramidite, can be used between the PNA and the 5' end of DNA (Mag et al. (1989)
Nucl Acid Res 17: 5973-88). PNA monomers are then coupled in a stepwise manner to
produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment (Finn et al.
(1996) above). Alternatively, chimeric molecules can be synthesized with a 5' DNA
segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem Lett 5:
1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such
as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport
across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A.
86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication
No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134).
In addition, oligonucleotides can be modified with hybridization triggered cleavage agents
(see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g.,
Zon, 1988, Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to
another molecule, e.g., a peptide, a hybridization triggered cross-linking agent, a transport
agent, a hybridization-triggered cleavage agent, etc.

4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain
the polynucleotides of the invention. For example, such host cells may contain nucleic acids
of the invention introduced into the host cell using known transformation, transfection or
infection methods. The present invention still further provides host cells genetically
engineered to express the polynucleotides of the invention, wherein such polynucleotides are
in operative association with a regulatory sequence heterologous to the host cell which
drives expression of the polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in
whole or in part, the naturally occurring promoter with all or part of a heterologous promoter
so that the cells express the polypeptide at higher levels. The heterologous promoter is
inserted in such a manner that it is operatively linked to the encoding sequences. See, for
example, PCT International Publication No. WO94/12650, PCT International Publication
No. WO92/20808, and PCT International Publication No. WO91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable marker DNA
(e.g., ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate
synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be
inserted along with the heterologous promoter DNA. If linked to the coding sequence,
amplification of the marker DNA by standard selection methods results in co-amplification
of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation
(Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one
of the polynucleotides of the invention can be used in conventional manners to produce the
gene product encoded by the isolated fragment (in the case of an ORF) or can be used to
produce a heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the
present invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells,
CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and
B. subtilis. The most preferred cells are those which do not normally express the particular
polypeptide or protein or which express the polypeptide or protein at a low natural level.
Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under
the control of appropriate promoters. Cell-free translation systems can also be employed to
produce such proteins using RNAs derived from the DNA constructs of the present
invention. Appropriate cloning and expression vectors for use with prokaryotic and
eukaryotic hosts are described by Sambrook, et al., in Molecular Cloning: A Laboratory
Manual, Second Edition, Cold Spring Harbor, New York (1989), the disclosure of which is
hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the COS-7 lines
of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines
capable of expressing a compatible vector are, for example, the C127, monkey COS cells,
Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal A431 cells,
human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell lines, normal
diploid cells, cell strains derived from in vitro culture of primary tissue, primary explants,
HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells. Mammalian expression
vectors will comprise an origin of replication, a suitable promoter and also any necessary
ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional
termination sequences, and 5' flanking nontranscribed sequences. DNA sequences derived
from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer, splice,
and polyadenylation sites may be used to provide the required nontranscribed genetic
elements. Recombinant polypeptides and proteins produced in bacterial culture are usually
isolated by initial extraction from cell pellets, followed by one or more salting-out, aqueous
ion exchange or size exclusion chromatography steps. Protein refolding steps can be used,
as necessary, in completing configuration of the mature protein. Finally, high performance
liquid chromatography (HPLC) can be employed for final purification steps. Microbial cells
employed in expression of proteins can be disrupted by any convenient method, including
freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as
yeast or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida,
or any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or
bacteria, it may be necessary to modify the protein produced therein, for example by
phosphorylation or glycosylation of the appropriate sites, in order to obtain the functional
protein. Such covalent attachments may be accomplished using known chemical or
enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered
to express an endogenous gene comprising the polynucleotides of the invention under the
control of inducible regulatory elements, in which case the regulatory sequences of the
endogenous gene may be replaced by homologous recombination. As described herein, gene
targeting can be used to replace a gene's existing regulatory region with a regulatory
sequence isolated from a different gene or a novel regulatory sequence synthesized by
genetic engineering methods. Such regulatory sequences may be comprised of promoters,
enhancers, scaffold-attachment regions, negative regulatory elements, transcriptional
initiation sites, and regulatory protein binding sites or combinations of said sequences.
Alternatively, sequences which affect the structure or stability of the RNA or protein
produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader
sequences for enhancing or modifying transport or secretion properties of the protein, or
other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple
deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the targeting event may replace an existing element; for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or different
cell-type specificity than the naturally occurring elements. Here, the naturally occurring
sequences are deleted and new sequences are added. In all cases, the identification of the
targeting event may be facilitated by the use of one or more selectable marker genes that are
contiguous with the targeting DNA, allowing for the selection of cells in which the
exogenous DNA has integrated into the host cell genome. The identification of the targeting
event may also be facilitated by the use of one or more marker genes exhibiting the property
of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting
sequence, and such that a correct homologous recombination event with sequences in the
host cell genome does not result in the stable integration of the negatively selectable marker.
Markers useful for this purpose include the Herpes Simplex Virus thymidine kinase (TK)
gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance
with this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071
to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by
reference herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising: the amino acid sequences set forth as any one of SEQ ID NO: 1042-2082,
or 2535-2986 or an amino acid sequence encoded by any one of the nucleotide
sequences SEQ ID NO: 1-1041, or 2083-2534 or the corresponding full length or mature
protein. Polypeptides of the invention also include polypeptides preferably with biological or
immunological activity that are encoded by: (a) a polynucleotide having any one of the
nucleotide sequences set forth in SEQ ID NO: 1-1041, or 2083-2534 or (b) polynucleotides
encoding any one of the amino acid sequences set forth as SEQ ID NO: 1042-2082, or 2535-2986
or (c) polynucleotides that hybridize to the complement of the polynucleotides of either
(a) or (b) under stringent hybridization conditions. The invention also provides biologically
active or immunologically active variants of any of the amino acid sequences set forth as
SEQ ID NO: 1042-2082, or 2535-2986 or the corresponding full length or mature protein;
and "substantial equivalents" thereof (e.g., with at least about 65%, at least about 70%, at
least about 75%, at least about 80%, at least about 85%, 86%, 87%, 88%, 89%, at least about
90%, 91%, 92%, 93%, 94%, typically at least about 95%, 96%, 97%, more typically at least
about 98%, or most typically at least about 99% amino acid identity) that retain biological
activity. Polypeptides encoded by allelic variants may have a similar, increased, or
decreased activity compared to polypeptides comprising SEQ ID NO: 1042-2082, or 2535-2986.
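For illustration only, a percent amino acid identity of the kind recited above can be computed as sketched below in Python. The function name percent_identity and the example sequences are hypothetical. The sketch assumes that the two sequences have already been aligned, with "-" marking gaps, and that identity is taken as identical residues divided by the number of alignment columns; other conventions for the denominator are in use and would give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned amino acid sequences.

    Assumptions: both strings have the same length, '-' marks a gap, and
    the denominator is the total number of alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only when both residues are identical
    # and the column is not a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Hypothetical ten-residue alignment differing at one position.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
is_substantial_equivalent = identity >= 65.0  # lowest threshold recited above
```

Whether a variant having a given percent identity also retains biological activity is a separate question that this calculation does not address.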

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein
may be in linear form or they may be cyclized using known methods, for example, as
described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S.
McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are
incorporated herein by reference. Such fragments may be fused to carrier molecules such as
immunoglobulins for many purposes, including increasing the valency of protein binding
sites. Fragments are also identified in Tables 3, 5, 6, and 8.

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein
coding sequence is identified in the sequence listing by translation of the disclosed
nucleotide sequences. The predicted signal sequence is set forth in Table 6. The mature
form of such protein may be obtained and confirmed by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell and sequencing of the cleaved
product. One of skill in the art will recognize that the actual cleavage site may be different
from that predicted in Table 6. The sequence of the mature form of the protein is also
determinable from the amino acid sequence of the full-length form. Where proteins of the
present invention are membrane bound, soluble forms of the proteins are also provided. In
such forms, part or all of the regions causing the proteins to be membrane bound are deleted
so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic
acid fragments of the present invention or by degenerate variants of the nucleic acid
fragments of the present invention. By "degenerate variant" is meant nucleotide
fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF)
by nucleotide sequence but, due to the degeneracy of the genetic code, encode an identical
polypeptide sequence. Preferred nucleic acid fragments of the present invention are the
ORFs that encode proteins.
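The notion of a degenerate variant can be illustrated with the following Python sketch, which is offered by way of example only. The names CODON_TABLE, translate and is_degenerate_variant, and the example ORFs, are hypothetical. The sketch assumes the standard genetic code and input ORFs that contain only the bases A, C, G and T and whose length is a multiple of three.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' is a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an ORF (coding strand, DNA alphabet) codon by codon."""
    seq = orf.upper()
    if len(seq) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the ORFs differ in nucleotide sequence but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


# Hypothetical ORFs: the second uses synonymous codons at four positions.
result = is_degenerate_variant("ATGGCTTGCAAATGA", "ATGGCCTGTAAGTAA")
```

Here both example ORFs translate to the same four-residue peptide followed by a stop, so the second would be a degenerate variant of the first under the definition given above.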
A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino
acid sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or
tertiary structural and/or conformational characteristics with proteins may possess biological
properties in common therewith, including protein activity. This technique is particularly
useful in producing small peptides and fragments of larger polypeptides. Fragments are
useful, for example, in generating antibodies against the native polypeptide. Thus, they may
be employed as biologically active or immunological substitutes for natural, purified
proteins in screening of therapeutic compounds and in immunological processes for the
development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified 
from cells which have been altered to express the desired polypeptide or protein. As used 
herein, a cell is said to be altered to express a desired polypeptide or protein when the cell, 
through genetic manipulation, is made to produce a polypeptide or protein which it normally 
does not produce or which the cell normally produces at a lower level. One skilled in the art
can readily adapt procedures for introducing and expressing either recombinant or synthetic 
sequences into eukaryotic or prokaryotic cells in order to generate a cell which produces one 
of the polypeptides or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising 
growing a culture of host cells of the invention in a suitable culture medium, and purifying
the protein from the cells or the culture in which the cells are grown. For example, the
methods of the invention include a process for producing a polypeptide in which a host cell
containing a suitable expression vector that includes a polynucleotide of the invention is
cultured under conditions that allow expression of the encoded polypeptide. The
polypeptide can be recovered from the culture, conveniently from the culture medium, or
from a lysate prepared from the host cells and further purified. Preferred embodiments
include those in which the protein produced by such process is a full length or mature form
of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells
which naturally produce the polypeptide or protein. One skilled in the art can readily follow
known methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange
chromatography, and immuno-affinity chromatography. See, e.g., Scopes, Protein
Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et al., in
Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in Molecular
Biology. Polypeptide fragments that retain biological/immunological activity include
fragments comprising greater than about 100 amino acids, or greater than about 200 amino
acids, and fragments that encode specific protein domains.

The purified polypeptides can be used in in vitro binding assays which are well
known in the art to identify molecules which bind to the polypeptides. These molecules
include, but are not limited to, small molecules, molecules from combinatorial
libraries, antibodies or other proteins. The molecules identified in the binding assay are then
tested for antagonist or agonist activity in in vivo tissue culture or animal models that are
well known in the art. In brief, the molecules are titrated into a plurality of cell cultures or
animals and then tested for either cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the
peptides may be complexed with toxins, e.g., ricin or cholera, or with other compounds that
are toxic to cells. The toxin-binding molecule complex is then targeted to a tumor or other
cell by the specificity of the binding molecule for SEQ ID NO: 1042-2082, or 2535-2986.

The protein of the invention may also be expressed as a product of transgenic 
animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are
characterized by somatic or germ cells containing a nucleotide sequence encoding the
protein.

The proteins provided herein also include proteins characterized by amino acid
sequences similar to those of purified proteins but into which modifications are naturally
provided or deliberately engineered. For example, modifications in the peptide or DNA
sequence can be made by those skilled in the art using known techniques. Modifications of
interest in the protein sequences may include the alteration, substitution, replacement,
insertion or deletion of a selected amino acid residue in the coding sequence. For example,
one or more of the cysteine residues may be deleted or replaced with another amino acid to
alter the conformation of the molecule. Techniques for such alteration, substitution,
replacement, insertion or deletion are well known to those skilled in the art (see, e.g., U.S.
Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement, insertion or
deletion retains the desired activity of the protein. Regions of the protein that are important
for the protein function can be determined by various methods known in the art, including the
alanine-scanning method, which involves systematic substitution of single amino acids or
strings of amino acids with alanine, followed by testing the resulting alanine-containing
variant for biological activity. This type of analysis determines the importance of the
substituted amino acid(s) in biological activity. Regions of the protein that are important for
protein function may be determined by the eMATRIX program.
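
By way of illustration only, the substitution step of the alanine-scanning method described
above can be sketched in Python. The function name alanine_scan and its window parameter are
assumptions introduced for this sketch and are not part of the disclosure; the sketch only
enumerates the variant sequences, and each variant would still have to be expressed and tested
for biological activity.

```python
def alanine_scan(sequence, window=1):
    """Enumerate alanine-substituted variants of a protein sequence.

    Returns a list of (position, original_segment, variant_sequence) tuples,
    where position is the 1-based index of the first substituted residue.
    window=1 substitutes single residues; window>1 substitutes strings of
    consecutive residues. (Hypothetical helper, for illustration only.)
    """
    sequence = sequence.upper()
    variants = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        if set(segment) == {"A"}:
            # The segment is already all alanine, so there is nothing to test.
            continue
        variant = sequence[:start] + "A" * window + sequence[start + window:]
        variants.append((start + 1, segment, variant))
    return variants


if __name__ == "__main__":
    for position, segment, variant in alanine_scan("MKCAL"):
        print(position, segment, variant)
```

A loss of activity in a given variant would suggest that the substituted residue or residues
contribute to function.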

Other fragments and derivatives of the sequences of proteins which would be 
expected to retain protein activity in whole or in part and are useful for screening or other 
immunological methodologies may also be easily made by those skilled in the art given the 
disclosures herein. Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of
the invention to suitable control sequences in one or more insect expression vectors, and
employing an insect expression system. Materials and methods for baculovirus/insect cell
expression systems are commercially available in kit form from, e.g., Invitrogen, San Diego,
Calif., U.S.A. (the MaxBac™ kit), and such methods are well known in the art, as described
in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987),
incorporated herein by reference. As used herein, an insect cell capable of expressing a
polynucleotide of the present invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells
under culture conditions suitable to express the recombinant protein. The resulting
expressed protein may then be purified from such culture (i.e., from culture medium or cell
extracts) using known purification processes, such as gel filtration and ion exchange
chromatography. The purification of the protein may also include an affinity column
containing agents which will bind to the protein; one or more column steps over such affinity
resins as concanavalin A-agarose, heparin-toyopearl™ or Cibacron blue 3GA Sepharose™;
one or more steps involving hydrophobic interaction chromatography using such resins as
phenyl ether, butyl ether, or propyl ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as
a His tag. Kits for expression and purification of such fusion proteins are commercially
available from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and
Invitrogen, respectively. The protein can also be tagged with an epitope and subsequently
purified by using a specific antibody directed to such epitope. One such epitope ("FLAG®")
is commercially available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography
(RP-HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant
methyl or other aliphatic groups, can be employed to further purify the protein. Some or all
of the foregoing purification steps, in various combinations, can also be employed to provide
a substantially homogeneous isolated recombinant protein. The protein thus purified is
substantially free of other mammalian proteins and is defined in accordance with the present
invention as an "isolated protein."

The polypeptides of the invention include analogs (variants). This embraces
fragments, as well as peptides in which one or more amino acids have been deleted, inserted,
or substituted. Also, analogs of the polypeptides of the invention embrace fusions of the
polypeptides or modifications of the polypeptides of the invention, wherein the polypeptide
or analog is fused to another moiety or moieties, e.g., a targeting moiety or another therapeutic
agent. Such analogs may exhibit improved properties such as activity and/or stability.
Examples of moieties which may be fused to the polypeptide or an analog include, for
example, targeting moieties which provide for the delivery of polypeptide to pancreatic cells,
e.g., antibodies to pancreatic cells, antibodies to immune cells such as T-cells, monocytes,
dendritic cells, granulocytes, etc., as well as receptors and ligands expressed on pancreatic or
immune cells. Other moieties which may be fused to the polypeptide include therapeutic
agents which are used for treatment, for example, immunosuppressive drugs such as
cyclosporin, FK506, azathioprine, CD3 antibodies and steroids. Also, polypeptides may be
fused to immune modulators, and other cytokines such as alpha or beta interferon.



4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE
IDENTITY AND SIMILARITY

Preferred methods to determine identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified
in computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul,
S.F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic
Acids Res. vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu
et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif
software (Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by
reference), Pfam software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322
(1998), herein incorporated by reference) and the Kyte-Doolittle hydrophobicity prediction
algorithm (J. Mol. Biol., 157, pp. 105-31 (1982), incorporated herein by reference).
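
As a non-limiting illustration, once two sequences have been aligned by any of the programs
listed above, a percent identity value can be computed from the aligned strings. The Python
sketch below assumes that the alignment has already been produced, that gaps are written as
"-", and that identity is reported relative to the number of aligned columns; other
conventions for the denominator exist, and the function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap are ignored. Columns in
    which only one sequence carries a gap count as mismatches.
    (Illustrative convention only; alignment programs may differ.)
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(percent_identity("AC-T", "ACGT"))
```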

Polypeptide sequences were examined by a proprietary algorithm, SeqLoc, that separates the
proteins into three sets of locales: intracellular, membrane, or secreted. This prediction is
based upon three characteristics of each polypeptide: the percentage of cysteine
residues, Kyte-Doolittle scores for the first 20 amino acids of each protein, and
Kyte-Doolittle scores used to calculate the longest hydrophobic stretch of the protein. Values
of predicted proteins are compared against the values from a set of 592 proteins of known
cellular localization from the Swissprot database (http://www.expasy.ch/sprot). Predictions
are based upon maximum likelihood estimation.
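
SeqLoc itself is proprietary and its implementation is not disclosed here. Purely as an
illustrative sketch, the three characteristics named above can be computed in Python from the
published Kyte-Doolittle hydropathy scale. The function names, the use of a mean score over
the first 20 residues, and the definition of a hydrophobic stretch as a run of consecutive
residues with a positive score are all assumptions made for this example; the maximum
likelihood classification step is not shown.

```python
# Kyte-Doolittle hydropathy values (J. Mol. Biol. 157:105-132, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def cysteine_percent(seq):
    """Percentage of residues that are cysteine."""
    return 100.0 * seq.count("C") / len(seq)


def n_terminal_hydropathy(seq, n=20):
    """Mean Kyte-Doolittle score of the first n residues (assumed summary)."""
    head = seq[:n]
    # Residues outside the 20 standard amino acids score 0.0 (assumption).
    return sum(KD.get(res, 0.0) for res in head) / len(head)


def longest_hydrophobic_stretch(seq, threshold=0.0):
    """Length of the longest run of residues scoring above threshold."""
    best = run = 0
    for res in seq:
        if KD.get(res, 0.0) > threshold:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


def localization_features(seq):
    """Return the three illustrative features for one polypeptide."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return {
        "cysteine_percent": cysteine_percent(seq),
        "n_terminal_hydropathy": n_terminal_hydropathy(seq),
        "longest_hydrophobic_stretch": longest_hydrophobic_stretch(seq),
    }


if __name__ == "__main__":
    print(localization_features("MKLLCVDE"))
```

In such a scheme, the feature values of a query protein would be compared against those of
proteins of known localization.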

The BLAST programs are publicly available from the National Center for
Biotechnology Information (NCBI) and other sources (BLAST Manual, Altschul, S., et al.,
NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. Biol. 215:403-410
(1990)).

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically
active portions of a protein according to the invention. Within the fusion protein, the term
"operatively linked" is intended to indicate that the polypeptide according to the invention
and the other polypeptide are fused in-frame to each other. The polypeptide can be fused to
the N-terminus or C-terminus, or to the middle.

For example, in one embodiment a fusion protein comprises a polypeptide according
to the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the
polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e.,
glutathione S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in
which the polypeptide sequences according to the invention comprise one or more domains
fused to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in
vivo. The immunoglobulin fusion proteins can be used to affect the bioavailability of a
cognate ligand. Inhibition of the ligand/protein interaction may be useful therapeutically for
both the treatment of proliferative and differentiative disorders, e.g., cancer, as well as
modulating (e.g., promoting or inhibiting) cell survival. Moreover, the immunoglobulin
fusion proteins of the invention can be used as immunogens to produce antibodies in a
subject, to purify ligands, and in screening assays to identify molecules that inhibit the
interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences are ligated together in-frame in accordance with conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction
enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as
appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic
ligation. In another embodiment, the fusion gene can be synthesized by conventional
techniques including automated DNA synthesizers. Alternatively, PCR amplification of
gene fragments can be carried out using anchor primers that give rise to complementary
overhangs between two consecutive gene fragments that can subsequently be annealed and
reamplified to generate a chimeric gene sequence (see, for example, Ausubel et al. (eds.)
Current Protocols in Molecular Biology, John Wiley & Sons, 1992). Moreover,
many expression vectors are commercially available that already encode a fusion moiety
(e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the invention can be
cloned into such an expression vector such that the fusion moiety is linked in-frame to the
protein of the invention.
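
By way of illustration only, the in-frame requirement for joining two coding sequences can be
checked computationally before any cloning is attempted. The Python sketch below is not a
description of any particular vector or kit; the function name fuse_in_frame, the optional
linker argument, and the rule of dropping a terminal stop codon from the upstream partner are
assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(cds):
    """Split a coding sequence into consecutive three-base codons."""
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def fuse_in_frame(upstream_cds, downstream_cds, linker=""):
    """Join two coding sequences so the downstream partner stays in frame.

    The upstream partner (e.g., a fusion moiety), the optional linker and
    the downstream partner must each be a whole number of codons. A stop
    codon ending the upstream partner is removed so that translation reads
    through into the downstream partner. (Hypothetical helper.)
    """
    upstream, link, downstream = (
        part.upper() for part in (upstream_cds, linker, downstream_cds)
    )
    for name, part in (("upstream", upstream), ("linker", link),
                       ("downstream", downstream)):
        if len(part) % 3 != 0:
            raise ValueError(f"{name} sequence is not a whole number of codons")
    codons = split_codons(upstream)
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]  # drop the terminal stop of the upstream partner
    codons += split_codons(link)
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("stop codon found before the downstream partner")
    return "".join(codons) + downstream


if __name__ == "__main__":
    print(fuse_in_frame("ATGGCCTAA", "ATGAAATAG", linker="GGATCC"))
```

A frame error of one or two bases at the junction would raise an error here, which corresponds
to a fusion that would not encode the intended downstream polypeptide.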

4.8 GENE THERAPY 
Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides
of the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more
particularly viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo
by use of physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for
example, Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For
additional reviews of gene therapy technology see Friedmann, Science, 244: 1275-1281
(1989); Verma, Scientific American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992).
Introduction of any one of the nucleotides of the present invention or a gene encoding the
polypeptides of the present invention can also be accomplished with extrachromosomal
substrates (transient expression) or artificial chromosomes (stable expression). Cells may
also be cultured ex vivo in the presence of proteins of the present invention in order to
proliferate or to produce a desired effect on or activity in such cells. Treated cells can then
be introduced in vivo for therapeutic purposes. Alternatively, it is contemplated that in other
human disease states, preventing the expression of or inhibiting the activity of polypeptides
of the invention will be useful in treating the disease states. It is contemplated that antisense
therapy or gene therapy could be applied to negatively regulate the expression of
polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated
RNA sequences, by methods known in the art. Further, the polypeptides of the present
invention can be inhibited by using targeted deletion methods, or the insertion of a negative
regulatory element such as a silencer, which is tissue specific.
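
As a simple illustration of the antisense approach, an antisense DNA oligonucleotide is the
reverse complement of the chosen region of the sense strand or transcript. The Python sketch
below shows only this sequence relationship; the function name antisense_oligo and its
start/length parameters are hypothetical, and the selection of an effective target region is
not addressed.

```python
# Complement table for DNA or RNA input; the output is written as DNA.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligo(target, start, length):
    """Return the DNA antisense sequence for a region of a sense strand.

    target is the sense DNA or mRNA sequence, start is a 0-based offset and
    length is the number of bases covered. The result is written 5' to 3'.
    (Hypothetical helper, for illustration only.)
    """
    region = target.upper()[start:start + length]
    if start < 0 or len(region) < length:
        raise ValueError("requested region extends beyond the target")
    # Complement each base, then reverse to read the oligo 5' to 3'.
    return region.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(antisense_oligo("ATGGCCAAG", 0, 6))
```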

The present invention still further provides cells genetically engineered in vivo to
express the polynucleotides of the invention, wherein such polynucleotides are in operative
association with a regulatory sequence heterologous to the host cell which drives expression of
the polynucleotides in the cell. These methods can be used to increase or decrease the
expression of the polynucleotides of the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of
cells to permit, increase, or decrease expression of endogenous polypeptide. Cells can be
modified (e.g., by homologous recombination) to provide increased polypeptide expression by
replacing, in whole or in part, the naturally occurring promoter with all or part of a heterologous
promoter so that the cells express the protein at higher levels. The heterologous promoter is
inserted in such a manner that it is operatively linked to the desired protein encoding sequences.
See, for example, PCT International Publication No. WO 94/12650, PCT International
Publication No. WO 92/20808, and PCT International Publication No. WO 91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable marker DNA (e.g.,
ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate synthase,
aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be inserted along with
the heterologous promoter DNA. If linked to the desired protein coding sequence,
amplification of the marker DNA by standard selection methods results in co-amplification of
the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control
of inducible regulatory elements, in which case the regulatory sequences of the endogenous
gene may be replaced by homologous recombination. As described herein, gene targeting can
be used to replace a gene's existing regulatory region with a regulatory sequence isolated from
a different gene or a novel regulatory sequence synthesized by genetic engineering methods.
Such regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment
regions, negative regulatory elements, transcriptional initiation sites, regulatory protein binding
sites or combinations of said sequences. Alternatively, sequences which affect the structure or
stability of the RNA or protein produced may be replaced, removed, added, or otherwise
modified by targeting. These sequences include polyadenylation signals, mRNA stability
elements, splice sites, leader sequences for enhancing or modifying transport or secretion
properties of the protein, or other sequences which alter or improve the function or stability of
protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple
deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the targeting event may replace an existing element; for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or different cell-type
specificity than the naturally occurring elements. Here, the naturally occurring sequences are
deleted and new sequences are added. In all cases, the identification of the targeting event may
be facilitated by the use of one or more selectable marker genes that are contiguous with the
targeting DNA, allowing for the selection of cells in which the exogenous DNA has integrated
into the cell genome. The identification of the targeting event may also be facilitated by the use
of one or more marker genes exhibiting the property of negative selection, such that the
negatively selectable marker is linked to the exogenous DNA, but configured such that the
negatively selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by
reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244: 1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination
are referred to as "knockout" animals. Knockout animals, preferably non-human mammals,
can be prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference.
Transgenic animals are useful to determine the roles polypeptides of the invention play in
biological processes, and preferably in disease states. Transgenic animals are useful as model
systems to identify compounds that modulate lipid metabolism. Transgenic animals,
preferably non-human mammals, are produced using methods as described in U.S. Patent No.
5,489,743 and PCT Publication No. WO 94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of
expression of the polypeptides of the invention. Inactivation can be carried out using
homologous recombination methods described above. Activation can be achieved by
supplementing or even replacing the homologous promoter to provide for increased protein
expression. The homologous promoter can be supplemented by insertion of one or more
heterologous enhancer elements known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to
express polypeptides of the invention or that express a variant polypeptide. Such animals are
useful as models for studying the in vivo activities of polypeptide as well as for studying
modulators of the polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY

The polynucleotides and proteins of the present invention are expected to exhibit one
or more of the uses or biological activities (including those associated with assays cited
herein) identified herein. Uses or activities described for proteins of the present invention
may be provided by administration or use of such proteins or of polynucleotides encoding
such proteins (such as, for example, in gene therapies or vectors suitable for introduction of
DNA). The mechanism underlying the particular condition or pathology will dictate whether
the polypeptides of the invention, the polynucleotides of the invention or modulators
(activators or inhibitors) thereof would be beneficial to the subject in need of treatment.
Thus, "therapeutic compositions of the invention" include compositions comprising isolated
polynucleotides (including recombinant DNA molecules, cloned genes and degenerate
variants thereof) or polypeptides of the invention (including full length protein, mature
protein and truncations or domains thereof), or compounds and other substances that
modulate the overall activity of the target gene products, either at the level of target
gene/protein expression or target protein activity. Such modulators include polypeptides,
analogs (variants), including fragments and fusion proteins, antibodies and other binding
proteins; chemical compounds that directly or indirectly activate or inhibit the polypeptides
of the invention (identified, e.g., via drug screening assays as described herein); antisense
polynucleotides and polynucleotides suitable for triple helix formation; and in particular
antibodies or other binding partners that specifically recognize one or more epitopes of the
polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular
activation or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage
of tissue differentiation or development or in disease states); as molecular weight markers on
gels; as chromosome markers or tags (when labeled) to identify chromosomes or to map
related gene positions; to compare with endogenous DNA sequences in patients to identify
potential genetic disorders; as probes to hybridize and thus discover novel, related DNA
sequences; as a source of information to derive PCR primers for genetic fingerprinting; as a
probe to "subtract-out" known sequences in the process of discovering other novel
polynucleotides; for selecting and making oligomers for attachment to a "gene chip" or other
support, including for examination of expression patterns; to raise anti-protein antibodies
using DNA immunization techniques; and as an antigen to raise anti-DNA antibodies or
elicit another immune response. Where the polynucleotide encodes a protein which binds or
potentially binds to another protein (such as, for example, in a receptor-ligand interaction),
the polynucleotide can also be used in interaction trap assays (such as, for example, that
described in Gyuris et al., Cell 75:791-803 (1993)) to identify polynucleotides encoding the
other protein with which binding occurs or to identify inhibitors of the binding interaction.
The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including
the labeled reagent) in assays designed to quantitatively determine levels of the protein (or
its receptor) in biological fluids; as markers for tissues in which the corresponding
polypeptide is preferentially expressed (either constitutively or at a particular stage of tissue
differentiation or development or in a disease state); and, of course, to isolate correlative
receptors or ligands. Proteins involved in these binding interactions can also be used to
screen for peptide or small molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent 
grade or kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the
art. References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F.
Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular
Cloning Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as
nutritional sources or supplements. Such uses include without limitation use as a protein or
amino acid supplement, use as a carbon source, use as a nitrogen source and use as a source of
carbohydrate. In such cases the polypeptide or polynucleotide of the invention can be added to
the feed of a particular organism or can be administered as a separate solid or liquid
preparation, such as in the form of powder, pills, solutions, suspensions or capsules. In the case
of microorganisms, the polypeptide or polynucleotide of the invention can be added to the
medium in or on which the microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or
inhibiting) activity or may induce production of other cytokines in certain cell populations.
A polynucleotide of the invention can encode a polypeptide exhibiting such attributes.
Many protein factors discovered to date, including all known cytokines, have exhibited
activity in one or more factor-dependent cell proliferation assays, and hence the assays serve
as a convenient confirmation of cytokine activity. The activity of therapeutic compositions
of the present invention is evidenced by any one of a number of routine factor-dependent cell
proliferation assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9,
B9/11, BaF3, MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1,
Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions of the invention can be used in
the following:

Assays for T-cell or thymocyte proliferation include without limitation those
described in: Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986;
Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology
133:327-341, 1991; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J.
Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells
or thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of
mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E.
e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine
Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current
Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and
Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al.,
Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938,
1983; Measurement of mouse and human interleukin 6 - Nordan, R. In Current Protocols in
Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991;
Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986; Measurement of human
Interleukin 11 - Bennett, R, Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols
in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1, John Wiley and Sons, Toronto. 1991;
Measurement of mouse and human Interleukin 9 - Ciarletta, A., Giannotti, J., Clark, S. C.
and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.13.1,
John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, 
proteins that affect APC-T cell interactions as well as direct T-cell effects by measuring 
proliferation and cytokine production) include, without limitation, those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies,
E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience
(Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter 6, Cytokines and their 
cellular receptors; Chapter 7, Immunologic studies in Humans); Weinberger et al., Proc. 
Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al., Eur. J. Immun. 11:405-411,
1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988. 

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity 
and be involved in the proliferation, differentiation and survival of pluripotent and totipotent
stem cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells
and/or germ line stem cells. Administration of the polypeptide of the invention to stem cells
in vivo or ex vivo is expected to maintain and expand cell populations in a totipotential or
pluripotential state, which would be useful for re-engineering damaged or diseased tissues,
transplantation, manufacture of bio-pharmaceuticals and the development of bio-sensors.
The ability to produce large quantities of human cells has important working applications for
the production of human proteins which currently must be obtained from non-human sources
or donors; implantation of cells to treat diseases such as Parkinson's, Alzheimer's and other
neurodegenerative diseases; tissues for grafting such as bone marrow, skin, cartilage, 
tendons, bone, muscle (including cardiac muscle), blood vessels, cornea, neural cells, 
gastrointestinal cells and others; and organs for transplantation such as kidney, liver, 
pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines
may be administered in combination with the polypeptide of the invention to achieve the
desired effect, including any of the growth factors listed herein, other stem cell maintenance
factors, and specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF),
Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6,
macrophage inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF,
thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth factor (PDGF),
neural growth factors and basic fibroblast growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion
of these cells in culture will facilitate the production of large quantities of mature cells.
Techniques for culturing stem cells are known in the art and administration of polypeptides
of the invention, optionally with other growth factors and/or cytokines, is expected to
enhance the survival and proliferation of the stem cell populations. This can be
accomplished by direct administration of the polypeptide of the invention to the culture
medium. Alternatively, stromal cells transfected with a polynucleotide that encodes the
polypeptide of the invention can be used as a feeder layer for the stem cell populations in
culture or in vivo. Stromal support cells for feeder layers may include embryonic bone
marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or cultured embryonic
fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to
induce autocrine expression of the polypeptide of the invention. This will allow for
generation of undifferentiated totipotential/pluripotential stem cell lines that are useful as is 
or that can then be differentiated into the desired mature cell types. These stable cell lines 
can also serve as a source of undifferentiated totipotential/pluripotential mRNA to create
cDNA libraries and templates for polymerase chain reaction experiments. These studies 
would allow for the isolation and identification of differentially expressed genes in stem cell 
populations that regulate stem cell proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the
treatment of many pathological conditions. For example, polypeptides of the present
invention may be used to manipulate stem cells in culture to give rise to neuroepithelial cells
that can be used to augment or replace cells damaged by illness, autoimmune disease,
accidental damage or genetic disorders. The polypeptide of the invention may be useful for
inducing the proliferation of neural cells and for the regeneration of nerve and brain tissue,
i.e. for the treatment of central and peripheral nervous system diseases and neuropathies, as
well as mechanical and traumatic disorders which involve degeneration, death or trauma to 
neural cells or nerve tissue. In addition, the expanded stem cell populations can also be 
genetically altered for gene therapy purposes and to decrease host rejection of replacement 
tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated
cell types. A broadly applicable method of obtaining pure populations of a specific
differentiated cell type from undifferentiated stem cell populations involves the use of a
cell-type specific promoter driving a selectable marker. The selectable marker allows only cells
of the desired type to survive. For example, stem cells can be induced to differentiate into
cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin.
Invest., 98(1): 216-224, (1998)) or skeletal muscle cells (Browder, L. W. In: Principles of
Tissue Engineering eds. Lanza et al., Academic Press (1997)). Alternatively, directed
differentiation of stem cells can be accomplished by culturing the stem cells in the presence
of a differentiation factor such as retinoic acid and an antagonist of the polypeptide of the
invention which would inhibit the effects of endogenous stem cell factor activity and allow
differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the
invention exhibits stem cell growth factor activity. Stem cells are isolated from any one of
various cell sources (including hematopoietic stem cells and embryonic stem cells) and
cultured on a feeder layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A.,
92: 7844-7848 (1995), in the presence of the polypeptide of the invention alone or in
combination with other growth factors or cytokines. The ability of the polypeptide of the
invention to induce stem cell proliferation is determined by colony formation on a semi-solid
support, e.g., as described by Bernstein et al., Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell disorders.
Even marginal biological activity in support of colony forming cells or of factor-dependent
cell lines indicates involvement in regulating hematopoiesis, e.g. in supporting the growth
and proliferation of erythroid progenitor cells alone or in combination with other cytokines,
thereby indicating utility, for example, in treating various anemias or for use in conjunction
with irradiation/chemotherapy to stimulate the production of erythroid precursors and/or
erythroid cells; in supporting the growth and proliferation of myeloid cells such as
granulocytes and monocytes/macrophages (i.e., traditional CSF activity) useful, for example,
in conjunction with chemotherapy to prevent or treat consequent myelo-suppression; in
supporting the growth and proliferation of megakaryocytes and consequently of platelets
thereby allowing prevention or treatment of various platelet disorders such as
thrombocytopenia, and generally for use in place of or complementary to platelet
transfusions; and/or in supporting the growth and proliferation of hematopoietic stem cells
which are capable of maturing to any and all of the above-mentioned hematopoietic cells and
therefore find therapeutic utility in various stem cell disorders (such as those usually treated
with transplantation, including, without limitation, aplastic anemia and paroxysmal nocturnal
hemoglobinuria), as well as in repopulating the stem cell compartment post
irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or
heterologous)) as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 
Suitable assays for proliferation and differentiation of various hematopoietic lines are 
cited above. 

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al., Cellular Biology 15:141-151, 1995; Keller et al.,
Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 
1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells.
R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994;
Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic
colony forming cells with high proliferative potential, McNiece, I. K. and Briddell, R. A. In
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc.,
New York, N.Y. 1994; Neben et al., Experimental Hematology 22:353-359, 1994;
Cobblestone area forming cell assay, Ploemacher, R. E. In Culture of Hematopoietic Cells.
R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss, Inc., New York, N.Y. 1994; Long term
bone marrow cultures in the presence of stromal cells, Spooncer, E., Dexter, M. and Allen,
T. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss,
Inc., New York, N.Y. 1994; Long term culture initiating cell assay, Sutherland, H. J. In
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 139-162, Wiley-Liss, Inc.,
New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, 
tendon, ligament and/or nerve tissue growth or regeneration, as well as in wound healing and
tissue repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed, has application in the healing of bone 
fractures and cartilage damage or defects in humans and other animals. Compositions of a 
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved
fixation of artificial joints. De novo bone formation induced by an osteogenic agent 
contributes to the repair of congenital, trauma induced, or oncologic resection induced 
craniofacial defects, and also is useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming
cells, stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or 
periodontal disease, such as through stimulation of bone and/or cartilage repair or by 
blocking inflammation or processes of tissue destruction (collagenase activity, osteoclast
activity, etc.) mediated by inflammatory processes may also be possible using the
composition of the invention. 

Another category of tissue regeneration activity that may involve the polypeptide of 
the present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue
or other tissue formation in circumstances where such tissue is not normally formed, has
application in the healing of tendon or ligament tears, deformities and other tendon or 
ligament defects in humans and other animals. Such a preparation employing a 
tendon/ligament-like tissue inducing protein may have prophylactic use in preventing 
damage to tendon or ligament tissue, as well as use in the improved fixation of tendon or
ligament to bone or other tissues, and in repairing defects to tendon or ligament tissue. De
novo tendon/ligament-like tissue formation induced by a composition of the present 
invention contributes to the repair of congenital, trauma induced, or other tendon or ligament 
defects of other origin, and is also useful in cosmetic plastic surgery for attachment or repair 
of tendons or ligaments. The compositions of the present invention may provide an
environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to 
effect tissue repair. The compositions of the invention may also be useful in the treatment of 
tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The compositions
may also include an appropriate matrix and/or sequestering agent as a carrier as is well
known in the art. 

The compositions of the present invention may also be useful for proliferation of 
neural cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and
traumatic disorders, which involve degeneration, death or trauma to neural cells or nerve
tissue. More specifically, a composition may be used in the treatment of diseases of the 
peripheral nervous system, such as peripheral nerve injuries, peripheral neuropathy and 
localized neuropathies, and central nervous system diseases, such as Alzheimer's, 
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and Shy-Drager
syndrome. Further conditions which may be treated in accordance with the present invention
include mechanical and traumatic disorders, such as spinal cord disorders, head trauma and 
cerebrovascular diseases such as stroke. Peripheral neuropathies resulting from
chemotherapy or other medical therapies may also be treatable using a composition of the
invention. 

Compositions of the invention may also be useful to promote better or faster closure 
of non-healing wounds, including without limitation pressure ulcers, ulcers associated with
vascular insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, 
intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular 
(including vascular endothelium) tissue, or for promoting the growth of cells comprising 
such tissues. Part of the desired effects may be by inhibition or modulation of fibrotic
scarring to allow normal tissue to regenerate. A polypeptide of the present invention may
also exhibit angiogenic activity. 

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and 
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or 
inhibiting differentiation of tissues described above from precursor tissues or cells; or for 
inhibiting the growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following: 
Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International
Patent Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: 
Winter, Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.),
Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest.
Dermatol. 71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or
immune suppressing activity, including without limitation the activities for which assays are
described herein. A polynucleotide of the invention can encode a polypeptide exhibiting
such activities. A protein may be useful in the treatment of various immune deficiencies and
disorders (including severe combined immunodeficiency (SCID)), e.g., in regulating (up or
down) growth and proliferation of T and/or B lymphocytes, as well as effecting the cytolytic
activity of NK cells and other cell populations. These immune deficiencies may be genetic or
be caused by viral (e.g., HIV) as well as bacterial or fungal infections, or may result from
autoimmune disorders. More specifically, infectious diseases caused by viral, bacterial,
fungal or other infection may be treatable using a protein of the present invention, including
infections by HIV, hepatitis viruses, herpes viruses, mycobacteria, Leishmania spp., malaria
spp. and various fungal infections such as candidiasis. Of course, in this regard, proteins of
the present invention may also be useful where a boost to the immune system generally may
be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention 
include, for example, connective tissue disease, multiple sclerosis, systemic lupus 
erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre 
syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis,
graft-versus-host disease and autoimmune inflammatory eye disease. Such a protein (or
antagonists thereof, including antibodies) of the present invention may also be useful in
the treatment of allergic reactions and conditions (e.g., anaphylaxis, serum sickness, drug
reactions, food allergies, insect venom allergies, mastocytosis, allergic rhinitis,
hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic dermatitis, allergic
contact dermatitis, erythema multiforme, Stevens-Johnson syndrome, allergic conjunctivitis,
atopic keratoconjunctivitis, vernal keratoconjunctivitis, giant papillary conjunctivitis and
contact allergies), such as asthma (particularly allergic asthma) or other respiratory
problems. Other conditions, in which immune suppression is desired (including, for
example, organ transplantation), may also be treatable using a protein (or antagonists
thereof) of the present invention. The therapeutic effects of the polypeptides or antagonists
thereof on allergic reactions can be evaluated by in vivo animal models such as the
cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66, 1998), skin
prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization test
(Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune 
responses in a number of ways. Down regulation may be in the form of inhibiting or
blocking an immune response already in progress or may involve preventing the induction of
an immune response. The functions of activated T cells may be inhibited by suppressing T
cell responses or by inducing specific tolerance in T cells, or both. Immunosuppression of T 
cell responses is generally an active, non-antigen-specific, process which requires continuous 
exposure of the T cells to the suppressive agent. Tolerance, which involves inducing 
non-responsiveness or anergy in T cells, is distinguishable from immunosuppression in that
it is generally antigen-specific and persists after exposure to the tolerizing agent has ceased. 
Operationally, tolerance can be demonstrated by the lack of a T cell response upon 
reexposure to specific antigen in the absence of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin
and organ transplantation and in graft-versus-host disease (GVHD). For example, blockage
of T cell function should result in reduced tissue destruction in tissue transplantation.
Typically, in tissue transplants, rejection of the transplant is initiated through its recognition
as foreign by T cells, followed by an immune reaction that destroys the transplant. The
administration of a therapeutic composition of the invention may prevent cytokine synthesis
by immune cells, such as T cells, and thus act as an immunosuppressant. Moreover, a lack
of costimulation may also be sufficient to anergize the T cells, thereby inducing tolerance in
a subject. Induction of long-term tolerance by B lymphocyte antigen-blocking reagents may
avoid the necessity of repeated administration of these blocking reagents. To achieve
sufficient immunosuppression or tolerance in a subject, it may also be necessary to block the
function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac
grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been 
used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as 
described in Lenschow et al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. 
Sci. USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul ed.,
Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used to
determine the effect of therapeutic compositions of the invention on the development of that
disease.

Blocking antigen function may also be therapeutically useful for treating
autoimmune diseases. Many autoimmune disorders are the result of inappropriate activation 
of T cells that are reactive against self-tissue and which promote the production of cytokines 
and autoantibodies involved in the pathology of the diseases. Preventing the activation of 
autoreactive T cells may reduce or eliminate disease symptoms. Administration of reagents
which block stimulation of T cells can be used to inhibit T cell activation and prevent 
production of autoantibodies or T cell-derived cytokines which may be involved in the 
disease process. Additionally, blocking reagents may induce antigen-specific tolerance of 
autoreactive T cells which could lead to long-term relief from the disease. The efficacy of
blocking reagents in preventing or alleviating autoimmune disorders can be determined
using a number of well-characterized animal models of human autoimmune diseases.
Examples include murine experimental autoimmune encephalitis, systemic lupus
erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune collagen
arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental myasthenia
gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a 
means of upregulating immune responses, may also be useful in therapy. Upregulation of
immune responses may be in the form of enhancing an existing immune response or eliciting
an initial immune response. For example, enhancing an immune response may be useful in
cases of viral infection, including systemic viral diseases such as influenza, the common 
cold, and encephalitis. 

Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory
form of a soluble peptide of the present invention and reintroducing the in vitro activated T 
cells into the patient. Another method of enhancing anti-viral immune responses would be to 
isolate infected cells from a patient, transfect them with a nucleic acid encoding a protein of 
the present invention as described herein such that the cells express all or a portion of the
protein on their surface, and reintroduce the transfected cells into the patient. The infected
cells would now be capable of delivering a costimulatory signal to, and thereby activating, T
cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal
to T cells to induce a T cell mediated immune response against the transfected tumor cells.
In addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected
with nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion)
of an MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II
alpha chain protein and an MHC class II beta chain protein to thereby express MHC class I
or MHC class II proteins on the cell surface. Expression of the appropriate class I or class II
MHC in conjunction with a peptide having the activity of a B lymphocyte antigen (e.g.,
B7-1, B7-2, B7-3) induces a T cell mediated immune response against the transfected tumor
cell. Optionally, a gene encoding an antisense construct which blocks expression of an MHC
class II associated protein, such as the invariant chain, can also be cotransfected with a DNA
encoding a peptide having the activity of a B lymphocyte antigen to promote presentation of
tumor associated antigens and induce tumor specific immunity. Thus, the induction of a T
cell mediated immune response in a human subject may be sufficient to overcome
tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al.,
J. Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et
al., Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092,
1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching 
(which will identify, among others, proteins that modulate T-cell dependent antibody
responses and that affect Th1/Th2 profiles) include, without limitation, those described in:
Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro
antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J.
E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, 
proteins that generate predominantly Th1 and CTL responses) include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; 
Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 
1992.

Dendritic cell-dependent assays (which will identify, among others, proteins 
expressed by dendritic cells that activate naive T-cells) include, without limitation, those 
described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of
Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260,
1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science
264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264, 
1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and Inaba et al., 
Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others,
proteins that prevent apoptosis after superantigen induction and proteins that regulate
lymphocyte homeostasis) include, without limitation, those described in: Darzynkiewicz et 
al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et 
al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk,
Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993;
Gorczyca et al., International Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine
et al., Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995;
Toki et al., Proc. Nat. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate
the release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present
invention, alone or in heterodimers with a member of the inhibin family, may be useful as a
contraceptive based on the ability of inhibins to decrease fertility in female mammals and
decrease spermatogenesis in male mammals. Administration of sufficient amounts of other
inhibins can induce infertility in these mammals. Alternatively, the polypeptide of the
invention, as a homodimer or as a heterodimer with other protein subunits of the inhibin
group, may be useful as a fertility inducing therapeutic, based upon the ability of activin
molecules to stimulate FSH release from cells of the anterior pituitary. See, for example,
U.S. Pat. No. 4,798,885. A polypeptide of the invention may also be useful for advancement
of the onset of fertility in sexually immature mammals, so as to increase the lifetime
reproductive performance of domestic animals such as, but not limited to, cows, sheep and
pigs.

The activity of a polypeptide of the invention may, among other means, be measured 
by the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: 
Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et 
al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. 
Natl. Acad. Sci. USA 83:3091-3095, 1986. 

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or 
chemokinetic activity for mammalian cells, including, for example, monocytes, fibroblasts, 
neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. 
Chemotactic and chemokinetic receptor activation can be used to mobilize or attract a 
desired cell population to a desired site of action. Chemotactic or chemokinetic compositions 
(e.g., proteins, antibodies, binding partners, or modulators of the invention) provide particular 
advantages in treatment of wounds and other trauma to tissues, as well as in treatment of 
localized infections. For example, attraction of lymphocytes, monocytes or neutrophils to 
tumors or sites of infection may result in improved immune responses against the tumor or 
infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 
stimulate, directly or indirectly, the directed orientation or movement of such cell 
population. Preferably, the protein or peptide has the ability to directly stimulate directed 
movement of cells. Whether a particular protein has chemotactic activity for a population of 
cells can be readily determined by employing such protein or peptide in any known assay for 
cell chemotaxis. 

Therapeutic compositions of the invention can be used in the following: 

Assays for chemotactic activity (which will identify proteins that induce or prevent 
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of 
cells across a membrane as well as the ability of a protein to induce the adhesion of one cell 
population to another cell population. Suitable assays for movement and adhesion include, 
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. 
Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene 
Publishing Associates and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta 
Chemokines 6.12.1-6.12.28); Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al., 
APMIS 103:140-146, 1995; Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of 
Immunol. 152:5860-5867, 1994; Johnston et al., J. of Immunol. 153:1762-1768, 1994. 


4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or 
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such 
attributes. Compositions may be useful in treatment of various coagulation disorders 
(including hereditary disorders, such as hemophilias) or to enhance coagulation and other 
hemostatic events in treating wounds resulting from trauma, surgery or other causes. A 
composition of the invention may also be useful for dissolving or inhibiting formation of 
thromboses and for treatment and prevention of conditions resulting therefrom (such as, for 
example, infarction of cardiac and central nervous system vessels (e.g., stroke)). 

Therapeutic compositions of the invention can be used in the following: 

Assays for hemostatic and thrombolytic activity include, without limitation, those 
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis 
Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, 
Prostaglandins 35:467-474, 1988. 

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation 
or metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. 
For example, the presence or increased expression of a polynucleotide/polypeptide of the 
invention may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing 
malignancy. Conversely, a defect in the gene or absence of the polypeptide may be 
associated with a cancer condition. Identification of single nucleotide polymorphisms 
associated with cancer or a predisposition to cancer may also be useful for diagnosis or 
prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor 
growth) and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. 
Therapeutic compositions of the invention may be effective in adult and pediatric oncology 
including in solid phase tumors/malignancies, locally advanced tumors, human soft tissue 
sarcomas, metastatic cancer, including lymphatic metastases, blood cell malignancies 
including multiple myeloma, acute and chronic leukemias, and lymphomas, head and neck 
cancers including mouth cancer, larynx cancer and thyroid cancer, lung cancers including 
small cell carcinoma and non-small cell cancers, breast cancers including small cell 
carcinoma and ductal carcinoma, gastrointestinal cancers including esophageal cancer, 
stomach cancer, colon cancer, colorectal cancer and polyps associated with colorectal 
neoplasia, pancreatic cancers, liver cancer, urologic cancers including bladder cancer and 
prostate cancer, malignancies of the female genital tract including ovarian carcinoma, uterine 
(including endometrial) cancers, and solid tumor in the ovarian follicle, kidney cancers 
including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant 
melanoma, tumor progression of human skin keratinocytes, squamous cell carcinoma, basal 
cell carcinoma, hemangiopericytoma and Kaposi's sarcoma. 

Polypeptides, polynucleotides, or modulators of polypeptides of the invention 
(including inhibitors and stimulators of the biological activity of the polypeptide of the 
invention) may be administered to treat cancer. Therapeutic compositions can be 
administered in therapeutically effective dosages alone or in combination with adjuvant 
cancer therapy such as surgery, chemotherapy, radiotherapy, thermotherapy, and laser 
therapy, and may provide a beneficial effect, e.g., reducing tumor size, slowing rate of tumor 
growth, inhibiting metastasis, or otherwise improving overall clinical condition, without 
necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a 
pharmaceutically acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer 
treatment is routine. Anti-cancer drugs that are well known in the art and can be used as a 
treatment in combination with the polypeptide or modulator of the invention include: 
Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan, Carboplatin, 
Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide, Cytarabine HCl 
(Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl, Doxorubicin HCl, 
Estramustine phosphate sodium, Etoposide (VP16-213), Floxuridine, 5-Fluorouracil (5-Fu), 
Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon Alpha-2a, Interferon 
Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), Lomustine, Mechlorethamine 
HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, Methotrexate (MTX), 
Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, Streptozocin, 
Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate, 
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin, 
Semustine, Teniposide, and Vindesine sulfate. 

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g., 
exposure to carcinogens) known in the art that predispose an individual to developing 
cancers. Under these circumstances, it may be beneficial to treat these individuals with 
therapeutically effective doses of the polypeptide of the invention to reduce the risk of 
developing cancers. 

In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays 
of cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) 
Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 
and Ch 21), tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 
52: 921-30 (1974), mobility and invasive potential of tumor cells in Boyden Chamber assays 
as described in Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis 
assays such as induction of vascularization of the chick chorioallantoic membrane or 
induction of vascular endothelial cell migration as described in Ribatta et al., Intl. J. Dev. 
Biol., 40: 1189-97 (1999) and Li et al., Clin. Exp. Metastasis, 17:423-9 (1999), respectively. 
Suitable tumor cell lines are available, e.g., from American Type Tissue Culture Collection 
catalogs. 


4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor, 
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of 
the invention can encode a polypeptide exhibiting such characteristics. Examples of such 
receptors and ligands include, without limitation, cytokine receptors and their ligands, 
receptor kinases and their ligands, receptor phosphatases and their ligands, receptors 
involved in cell-cell interactions and their ligands (including without limitation, cellular 
adhesion molecules (such as selectins, integrins and their ligands) and receptor/ligand pairs 
involved in antigen presentation, antigen recognition and development of cellular and 
humoral immune responses). Receptors and ligands are also useful for screening of potential 
peptide or small molecule inhibitors of the relevant receptor/ligand interaction. Proteins of 
the present invention (including, without limitation, fragments of receptors and ligands) may 
themselves be useful as inhibitors of receptor/ligand interactions. 

The activity of a polypeptide of the invention may, among other means, be measured 
by the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. 
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static conditions 
7.28.1-7.28.22), Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., 
J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med. 169:149-160, 1989; 
Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995. 

By way of example, the polypeptides of the invention may be used as a receptor for a 
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be 
identified through binding assays, affinity chromatography, dihybrid screening assays, 
BIAcore assays, gel overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or 
partial antagonists require the use of other proteins as competing ligands. The polypeptides 
of the present invention or ligand(s) thereof may be labeled by being coupled to 
radioisotopes, colorimetric molecules or toxin molecules by conventional methods 
("Guide to Protein Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 
(1990) Academic Press, Inc. San Diego). Examples of radioisotopes include, but are not 
limited to, tritium and carbon-14. Examples of colorimetric molecules include, but are not 
limited to, fluorescent molecules such as fluorescamine, or rhodamine or other colorimetric 
molecules. Examples of toxins include, but are not limited to, ricin. 

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening 
techniques. The polypeptides or fragments employed in such a test may either be free in 
solution, affixed to a solid support, borne on a cell surface or located intracellularly. One 
method of drug screening utilizes eukaryotic or prokaryotic host cells which are stably 
transformed with recombinant nucleic acids expressing the polypeptide or a fragment 
thereof. Drugs are screened against such transformed cells in competitive binding assays. 
Such cells, either in viable or fixed form, can be used for standard binding assays. One may 
measure, for example, the formation of complexes between polypeptides of the invention or 
fragments and the agent being tested or examine the diminution in complex formation 
between the novel polypeptides and an appropriate cell line, which are well known in the art. 
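
By way of non-limiting illustration, the diminution in complex formation measured in a competitive binding assay can be expressed as a percent inhibition relative to a control without test compound. The following Python sketch is an assumption-laden example only: the function names, the background subtraction step and the 50% hit threshold are introduced here for illustration and are not taken from the specification.

```python
def percent_inhibition(signal_with_compound, signal_no_compound, background=0.0):
    """Percent decrease in complex formation caused by a test compound.

    All three arguments are raw assay readings in the same (arbitrary) units.
    """
    control = signal_no_compound - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    treated = signal_with_compound - background
    return 100.0 * (1.0 - treated / control)


def rank_hits(readings, signal_no_compound, background=0.0, threshold=50.0):
    """Return (compound, inhibition) pairs at or above threshold, strongest first.

    `readings` maps a hypothetical compound name to its raw assay signal.
    The threshold is an illustrative cut-off, not a recommended value.
    """
    scored = [
        (name, percent_inhibition(value, signal_no_compound, background))
        for name, value in readings.items()
    ]
    hits = [(name, pct) for name, pct in scored if pct >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

In practice the readings would come from whichever labeled binding format is used, and the threshold would be set from the variability of the control wells.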

Sources for test compounds that may be screened for ability to bind to or modulate 
(i.e., increase or decrease) the activity of polypeptides of the invention include (1) inorganic 
and organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or 
marine microorganisms or (2) extraction of the organisms themselves. Natural product 
libraries include polyketides, non-ribosomal peptides, and (non-naturally occurring) variants 
thereof. For a review, see Science 282:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides 
or organic compounds and can be readily prepared by traditional automated synthesis 
methods, PCR, cloning or proprietary synthetic methods. Of particular interest are peptide 
and oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, 
protein, peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide 
libraries. For a review of combinatorial chemistry and libraries created therefrom, see 
Myers, Curr. Opin. Biotechnol. 8:701-707 (1997). For reviews and examples of 
peptidomimetic libraries, see Al-Obeidi et al., Mol Biotechnol, 9(3):205-23 (1998); Hruby 
et al., Curr Opin Chem Biol, 1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 
4(5):709-15 (1996) (alkylated dipeptides). 

Identification of modulators through use of the various libraries described herein 
permits modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" 
to bind a polypeptide of the invention. The molecules identified in the binding assay are then 
tested for antagonist or agonist activity in in vivo tissue culture or animal models that are 
well known in the art. In brief, the molecules are titrated into a plurality of cell cultures or 
animals and then tested for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The 
toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity of 
the binding molecule for a polypeptide of the invention. Alternatively, the binding 
molecules may be complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a 
ligand or a receptor. The art provides numerous assays particularly useful for identifying 
previously unknown binding partners for receptor polypeptides of the invention. For 
example, expression cloning using mammalian or bacterial cells, or dihybrid screening 
assays can be used to identify polynucleotides encoding binding partners. As another 
example, affinity chromatography with the appropriate immobilized polypeptide of the 
invention can be used to isolate polypeptides that recognize and bind polypeptides of the 
invention. There are a number of different libraries used for the identification of 
compounds, and in particular small molecules, that modulate (i.e., increase or decrease) 
biological activity of a polypeptide of the invention. Ligands for receptor polypeptides of the 
invention can also be identified by adding exogenous ligands, or cocktails of ligands, to two 
cell populations that are genetically identical except for the expression of the receptor of the 
invention: one cell population expresses the receptor of the invention whereas the other does 
not. The responses of the two cell populations to the addition of ligand(s) are then 
compared. Alternatively, an expression library can be co-expressed with the polypeptide of 
the invention in cells and assayed for an autocrine response to identify potential ligand(s). As 
still another example, BIAcore assays, gel overlay assays, or other methods known in the art 
can be used to identify binding partner polypeptides, including, (1) organic and inorganic 
chemical libraries, (2) natural product libraries, and (3) combinatorial libraries comprised of 
random peptides, oligonucleotides or organic molecules. 
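
By way of non-limiting illustration, the comparison of two otherwise identical cell populations described above can be reduced to a fold-change calculation per ligand. The Python sketch below is an assumption for illustration only: the function name, the dictionary layout of the measurements and the two-fold cut-off are hypothetical and are not part of the specification.

```python
def receptor_dependent_responses(expressing, control, min_fold=2.0):
    """Return ligands whose response depends on expression of the receptor.

    `expressing` and `control` map a ligand name to a measured response in
    cells that do and do not express the receptor, respectively. A ligand is
    kept when its response in expressing cells is at least `min_fold` times
    the response in control cells. The cut-off is illustrative only.
    """
    candidates = {}
    for ligand, response in expressing.items():
        baseline = control.get(ligand)
        if baseline is None or baseline <= 0:
            continue  # no usable control measurement for this ligand
        fold = response / baseline
        if fold >= min_fold:
            candidates[ligand] = fold
    return candidates
```

A ligand that produces the same response in both populations is filtered out, since that response cannot be attributed to the receptor of the invention.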

The role of downstream intracellular signaling molecules in the signaling cascade of 
the polypeptide of the invention can be determined. For example, a chimeric protein in 
which the cytoplasmic domain of the polypeptide of the invention is fused to the 
extracellular portion of a protein, whose ligand has been identified, is produced in a host 
cell. The cell is then incubated with the ligand specific for the extracellular portion of the 
chimeric protein, thereby activating the chimeric receptor. Known downstream proteins 
involved in intracellular signaling can then be assayed for expected modifications, i.e., 
phosphorylation. Other methods known to those in the art can also be used to identify 
signaling molecules involved in receptor activity. 

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity. 
The anti-inflammatory activity may be achieved by providing a stimulus to cells involved in 
the inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for 
example, cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the 
inflammatory process, inhibiting or promoting cell extravasation, or by stimulating or 
suppressing production of other factors which more directly inhibit or promote an 
inflammatory response. Compositions with such activities can be used to treat inflammatory 
conditions (including chronic or acute conditions), including without limitation inflammation 
associated with infection (such as septic shock, sepsis or systemic inflammatory response 
syndrome (SIRS)), ischemia-reperfusion injury, endotoxin lethality, arthritis, 
complement-mediated hyperacute rejection, nephritis, cytokine or chemokine-induced lung 
injury, inflammatory bowel disease, Crohn's disease or resulting from over production of 
cytokines such as TNF or IL-1. Compositions of the invention may also be useful to treat 
anaphylaxis and hypersensitivity to an antigenic substance or material. Compositions of this 
invention may be utilized to prevent or treat conditions such as, but not limited to, sepsis, 
acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid arthritis, chronic 
inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, graft versus 
host disease, inflammatory bowel disease, inflammation associated with pulmonary disease, 
other autoimmune disease or inflammatory disease, as an antiproliferative agent such as for 
acute or chronic myelogenous leukemia, or in the prevention of premature labor secondary to 
intrauterine infections. 

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of 
the invention. Such leukemias and related disorders include but are not limited to acute 
leukemia, acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, 
promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic 
myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such 
disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 


4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient 
(including human and non-human mammalian patients) according to the invention include 
but are not limited to the following lesions of either the central (including spinal cord, brain) 
or peripheral nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated 
with surgery, for example, lesions which sever a portion of the nervous system, or 
compression injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or 
injured as a result of infection, for example, by an abscess or associated with infection by 
human immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme 
disease, tuberculosis, or syphilis; 

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration 
associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or 
amyotrophic lateral sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of 
the nervous system is destroyed or injured by a nutritional disorder or disorder of 
metabolism including but not limited to, vitamin B12 deficiency, folic acid deficiency, 
Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary 
degeneration of the corpus callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not 
limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, 
carcinoma, or sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or 
injured by a demyelinating disease including but not limited to multiple sclerosis, human 
immunodeficiency virus-associated myelopathy, transverse myelopathy of various 
etiologies, progressive multifocal leukoencephalopathy, and central pontine myelinolysis. 

Therapeutics which are useful according to the invention for treatment of a nervous 
system disorder may be selected by testing for biological activity in promoting the survival 
or differentiation of neurons. For example, and not by way of limitation, therapeutics which 
elicit any of the following effects may be useful according to the invention: 

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo, 
e.g., choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or 

(iv) decreased symptoms of neuron dysfunction in vivo. 

Such effects may be measured by any method known in the art. In preferred, 
non-limiting embodiments, increased survival of neurons may be measured by the method 
set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons 
may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or 
Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of 
neuron-associated molecules may be measured by bioassay, enzymatic assay, antibody 
binding, Northern blot assay, etc., depending on the molecule to be measured; and motor 
neuron dysfunction may be measured by assessing the physical manifestation of motor 
neuron disorder, e.g., weakness, motor neuron conduction velocity, or functional disability. 

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to 
toxin, trauma, surgical damage, degenerative disease or malignancy that may affect motor 
neurons as well as other components of the nervous system, as well as disorders that 
selectively affect neurons such as amyotrophic lateral sclerosis, and including but not limited 
to progressive spinal muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, 
infantile and juvenile muscular atrophy, progressive bulbar paralysis of childhood 
(Fazio-Londe syndrome), poliomyelitis and the post polio syndrome, and Hereditary 
Motorsensory Neuropathy (Charcot-Marie-Tooth Disease). 

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following 
additional activities or effects: inhibiting the growth, infection or function of, or killing, 
infectious agents, including, without limitation, bacteria, viruses, fungi and other parasites; 
affecting (suppressing or enhancing) bodily characteristics, including, without limitation, 
height, weight, hair color, eye color, skin, fat to lean ratio or other tissue pigmentation, or 
organ or body part size or shape (such as, for example, breast augmentation or diminution, 
change in bone form or shape); affecting biorhythms or circadian cycles or rhythms; 
affecting the fertility of male or female subjects; affecting the metabolism, catabolism, 
anabolism, processing, utilization, storage or elimination of dietary fat, lipid, protein, 
carbohydrate, vitamins, minerals, co-factors or other nutritional factors or component(s); 
affecting behavioral characteristics, including, without limitation, appetite, libido, stress, 
cognition (including cognitive disorders), depression (including depressive disorders) and 
violent behaviors; providing analgesic effects or other pain reducing effects; promoting 
differentiation and growth of embryonic stem cells in lineages other than hematopoietic 
lineages; hormonal or endocrine activity; in the case of enzymes, correcting deficiencies of 
the enzyme and treating deficiency-related diseases; treatment of hyperproliferative 
disorders (such as, for example, psoriasis); immunoglobulin-like activity (such as, for 
example, the ability to bind antigens or complement); and the ability to act as an antigen in a 
vaccine composition to raise an immune response against such protein or another material or 
entity which is cross-reactive with such protein. 


4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for 
diagnosis and treatment. Such polymorphisms may be associated with, e.g., differential 
predisposition or susceptibility to various disease states (such as disorders involving 
inflammation or immune response) or a differential response to drug administration, and this 
genetic information can be used to tailor preventive or therapeutic treatment appropriately. 
For example, the existence of a polymorphism associated with a predisposition to 
inflammation or autoimmune disease makes possible the diagnosis of this condition in 
humans by identifying the presence of the polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, 
optionally involving isolation or amplification of the DNA, and identifying the presence of 
the polymorphism in the DNA. For example, PCR may be used to amplify an appropriate
fragment of genomic DNA which may then be sequenced. Alternatively, the DNA may be
subjected to allele-specific oligonucleotide hybridization (in which appropriate
oligonucleotides are hybridized to the DNA under conditions permitting detection of a single
base mismatch) or to a single nucleotide extension assay (in which an oligonucleotide that
hybridizes immediately adjacent to the position of the polymorphism is extended with one or
more labeled nucleotides). In addition, traditional restriction fragment length polymorphism
analysis (using restriction enzymes that provide differential digestion of the genomic DNA
depending on the presence or absence of the polymorphism) may be performed. Arrays with
nucleotide sequences of the present invention can be used to detect polymorphisms. The
array can comprise modified nucleotide sequences of the present invention in order to detect 
the nucleotide sequences of the present invention. In the alternative, any one of the 
nucleotide sequences of the present invention can be placed on the array to detect changes 
from those sequences. 
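
The restriction fragment length polymorphism analysis mentioned above can be illustrated with a minimal Python sketch. It is not part of the disclosed method: the two short sequences are hypothetical, only one strand is modeled, and the function names are chosen for illustration. The recognition site GAATTC with a cut after the first base corresponds to the well-known enzyme EcoRI. A single base change that destroys the site changes the fragment pattern, which is what the analysis detects.

```python
def fragment_lengths(dna: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting `dna` at each occurrence of `site`.

    `cut_offset` is the position inside the recognition site where the
    strand is cut (1 for a G^AATTC-style cut).
    """
    cuts = []
    start = 0
    while True:
        idx = dna.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cuts + [len(dna)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def digestion_differs(reference: str, sample: str, site: str, cut_offset: int) -> bool:
    """True when the sample gives a different fragment pattern than the reference."""
    return fragment_lengths(reference, site, cut_offset) != fragment_lengths(
        sample, site, cut_offset
    )


# Hypothetical sequences: the variant differs by one base inside the site.
REFERENCE = "ATGCGAATTCTTAGC"
VARIANT = "ATGCGAGTTCTTAGC"

if __name__ == "__main__":
    print(fragment_lengths(REFERENCE, "GAATTC", 1))
    print(fragment_lengths(VARIANT, "GAATTC", 1))
    print(digestion_differs(REFERENCE, VARIANT, "GAATTC", 1))
```

In practice the fragments would be resolved by size, for example on a gel, rather than computed from a known sequence; the sketch only shows why presence or absence of the polymorphism alters the digestion pattern.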

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein,
e.g., by an antibody specific to the variant sequence.

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against
rheumatoid arthritis are determined in an experimental animal model system. The
experimental model system is adjuvant induced arthritis in rats, and the protocol is described
by J. Holoshitz, et al., 1983, Science, 219:56, or by B. Waksman et al., 1963, Int. Arch.
Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a single
injection, generally intradermally, of a suspension of killed Mycobacterium tuberculosis in
complete Freund's adjuvant (CFA). The route of injection can vary, but rats may be injected
at the base of the tail with an adjuvant mixture. The polypeptide is administered in phosphate
buffered saline (PBS) at a dose of about 1-5 mg/kg. The control consists of administering
PBS only.

The procedure for testing the effects of the test compound would consist of
intradermally injecting killed Mycobacterium tuberculosis in CFA followed by immediately
administering the test compound and subsequent treatment every other day until day 24. At
14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium in CFA, an overall arthritis
score may be obtained as described by J. Holoshitz above. An analysis of the data would
reveal that the test compound would have a dramatic effect on the swelling of the joints as
measured by a decrease of the arthritis score.
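
The timing of the procedure above can be laid out as a simple schedule. The Python sketch below is illustrative only and rests on two assumptions not stated in the text: day 0 is taken to be the day of the Mycobacterium/CFA injection, and "every other day" is read as every second day counted from day 0. The names are hypothetical.

```python
# Scoring days as listed in the procedure (days after the CFA injection).
SCORING_DAYS = (14, 15, 18, 20, 22, 24)
LAST_TREATMENT_DAY = 24


def treatment_days(last_day: int = LAST_TREATMENT_DAY, interval: int = 2) -> list[int]:
    """Days on which test compound (or the PBS control) is administered.

    Assumes the first administration is on day 0, immediately after induction.
    """
    return list(range(0, last_day + 1, interval))


def build_schedule() -> dict[int, list[str]]:
    """Map each study day to the actions planned for that day."""
    schedule: dict[int, list[str]] = {}
    for day in treatment_days():
        schedule.setdefault(day, []).append("treat")
    for day in SCORING_DAYS:
        schedule.setdefault(day, []).append("score")
    return dict(sorted(schedule.items()))


if __name__ == "__main__":
    for day, actions in build_schedule().items():
        print(f"day {day:2d}: {', '.join(actions)}")
```

Under these assumptions, day 15 is the only scoring day on which no treatment is given.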

4.11 THERAPEUTIC METHODS 
The compositions (including polypeptide fragments, analogs, variants and antibodies
or other binding partners or modulators including antisense polynucleotides) of the invention 
have numerous applications in a variety of therapeutic methods. Examples of therapeutic 
applications include, but are not limited to, those exemplified herein. 


4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode
of administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age,
weight, condition and response of the individual patient. Typically, the amount of
polypeptide administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of
body weight, with the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body
weight. For parenteral administration, polypeptides of the invention will be formulated in an
injectable form combined with a pharmaceutically acceptable parenteral vehicle. Such
vehicles are well known in the art and examples include water, saline, Ringer's solution,
dextrose solution, and solutions containing small amounts of human serum albumin.
The vehicle may contain minor amounts of additives that maintain the isotonicity and
stability of the polypeptide or other active ingredient. The preparation of such solutions is
within the skill of the art.
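
As a worked illustration of the weight-based range given above, the following Python sketch converts the stated per-kilogram bounds (about 0.01 µg/kg to 100 mg/kg) into a per-dose mass range for a given body weight. The 70 kg figure is a hypothetical example, the function name is illustrative, and the result is arithmetic only; as the text states, the actual dosage is determined by the prescribing physician.

```python
from decimal import Decimal

# Bounds of the typical per-dose range stated in the text, in micrograms per kg.
LOW_UG_PER_KG = Decimal("0.01")           # 0.01 µg/kg
HIGH_UG_PER_KG = Decimal("100") * 1000    # 100 mg/kg expressed in µg/kg


def dose_range_ug(weight_kg) -> tuple[Decimal, Decimal]:
    """Return (lowest, highest) per-dose mass in micrograms for a body weight."""
    weight = Decimal(str(weight_kg))
    if weight <= 0:
        raise ValueError("body weight must be positive")
    return LOW_UG_PER_KG * weight, HIGH_UG_PER_KG * weight


if __name__ == "__main__":
    low, high = dose_range_ug(70)  # hypothetical 70 kg patient
    print(f"{low} µg to {high / 1000} mg per dose")
```

Decimal arithmetic is used so that small bounds such as 0.01 µg/kg are not distorted by binary floating-point rounding.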

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF ADMINISTRATION

A protein or other composition of the present invention (from whatever source 
derived, including without limitation from recombinant and non-recombinant sources and 
including antibodies and other binding partners of the polypeptides of the invention) may be 
administered to a patient in need, by itself, or in pharmaceutical compositions where it is
mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a variety of
disorders. Such a composition may optionally contain (in addition to protein or other active
ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers, solubilizers, and other
materials well known in the art. The term "pharmaceutically acceptable" means a non-toxic
material that does not interfere with the effectiveness of the biological activity of the active
ingredient(s). The characteristics of the carrier will depend on the route of administration.
The pharmaceutical composition of the invention may also contain cytokines, lymphokines,
or other hematopoietic factors such as M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5,
IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2,
G-CSF, Meg-CSF, thrombopoietin, stem cell factor, and erythropoietin. In further
compositions, proteins of the invention may be combined with other agents beneficial to the
treatment of the disease or disorder in question. These agents include various growth factors
such as epidermal growth factor (EGF), platelet-derived growth factor (PDGF), transforming
growth factors (TGF-α and TGF-β), insulin-like growth factor (IGF), as well as cytokines
described herein. 

The pharmaceutical composition may further contain other agents which either 
enhance the activity of the protein or other active ingredient or complement its activity or
use in treatment. Such additional factors and/or agents may be included in the
pharmaceutical composition to produce a synergistic effect with protein or other active
ingredient of the invention, or to minimize side effects. Conversely, protein or other active
ingredient of the present invention may be included in formulations of the particular clotting
factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic
factor, or anti-inflammatory agent to minimize side effects of the clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids,
immunosuppressive agents). A protein of the present invention may be active in multimers
(e.g., heterodimers or homodimers) or complexes with itself or other proteins. As a result,
pharmaceutical compositions of the invention may comprise a protein of the invention in
such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application
may be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA,
latest edition. A therapeutically effective dose further refers to that amount of the compound
sufficient to result in amelioration of symptoms, e.g., treatment, healing, prevention or
amelioration of the relevant medical condition, or an increase in rate of treatment, healing,
prevention or amelioration of such conditions. When applied to an individual active
ingredient, administered alone, a therapeutically effective dose refers to that ingredient
alone. When applied to a combination, a therapeutically effective dose refers to combined
amounts of the active ingredients that result in the therapeutic effect, whether administered
in combination, serially or simultaneously.

In practicing the method of treatment or use of the present invention, a 
therapeutically effective amount of protein or other active ingredient of the present invention
is administered to a mammal having a condition to be treated. Protein or other active
ingredient of the present invention may be administered in accordance with the method of
the invention either alone or in combination with other therapies such as treatments
employing cytokines, lymphokines or other hematopoietic factors. When co-administered
with one or more cytokines, lymphokines or other hematopoietic factors, protein or other
active ingredient of the present invention may be administered either simultaneously with
the cytokine(s), lymphokine(s), other hematopoietic factor(s), thrombolytic or
anti-thrombotic factors, or sequentially. If administered sequentially, the attending physician
will decide on the appropriate sequence of administering protein or other active ingredient of
the present invention in combination with cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, 
transmucosal, or intestinal administration; parenteral delivery, including intramuscular,
subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular,
intravenous, intraperitoneal, intranasal, or intraocular injections. Administration of protein
or other active ingredient of the present invention used in the pharmaceutical composition or
to practice the method of the present invention can be carried out in a variety of conventional
ways, such as oral ingestion, inhalation, topical application or cutaneous, subcutaneous,
intraperitoneal, parenteral or intravenous injection. Intravenous administration to the patient
is preferred. 

Alternatively, one may administer the compound in a local rather than systemic
manner, for example, via injection of the compound directly into arthritic joints or into
fibrotic tissue, often in a depot or sustained release formulation. In order to prevent the
scarring process frequently occurring as a complication of glaucoma surgery, the compounds
may be administered topically, for example, as eye drops. Furthermore, one may administer
the drug in a targeted drug delivery system, for example, in a liposome coated with a specific
antibody, targeting, for example, arthritic or fibrotic tissue. The liposomes will be targeted
to and taken up selectively by the afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an 
effective dosage to the desired site of action. The determination of a suitable route of 
administration and an effective dosage for a particular indication is within the level of skill
in the art. Preferably for wound treatment, one administers the therapeutic compound
directly to the site. Suitable dosage ranges for the polypeptides of the invention can be
extrapolated from these dosages or from similar studies in appropriate animal models.
Dosages can then be adjusted as necessary by the clinician to provide maximal therapeutic
benefit.


4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus 
may be formulated in a conventional manner using one or more physiologically acceptable 
carriers comprising excipients and auxiliaries which facilitate processing of the active
compounds into preparations which can be used pharmaceutically. These pharmaceutical
compositions may be manufactured in a manner that is itself known, e.g., by means of
conventional mixing, dissolving, granulating, dragee-making, levigating, emulsifying,
encapsulating, entrapping or lyophilizing processes. Proper formulation is dependent upon
the route of administration chosen. When a therapeutically effective amount of protein or
other active ingredient of the present invention is administered orally, protein or other active
ingredient of the present invention will be in the form of a tablet, capsule, powder, solution
or elixir. When administered in tablet form, the pharmaceutical composition of the invention
may additionally contain a solid carrier such as a gelatin or an adjuvant. The tablet, capsule,
and powder contain from about 5 to 95% protein or other active ingredient of the present
invention, and preferably from about 25 to 90% protein or other active ingredient of the
present invention. When administered in liquid form, a liquid carrier such as water,
petroleum, oils of animal or plant origin such as peanut oil, mineral oil, soybean oil, or
sesame oil, or synthetic oils may be added. The liquid form of the pharmaceutical
composition may further contain physiological saline solution, dextrose or other saccharide
solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. When
administered in liquid form, the pharmaceutical composition contains from about 0.5 to 90%
by weight of protein or other active ingredient of the present invention, and preferably from
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, 
protein or other active ingredient of the present invention will be in the form of a 
pyrogen-free, parenterally acceptable aqueous solution. The preparation of such parenterally 
acceptable protein or other active ingredient solutions, having due regard to pH, isotonicity,
stability, and the like, is within the skill in the art. A preferred pharmaceutical composition
for intravenous, cutaneous, or subcutaneous injection should contain, in addition to protein
or other active ingredient of the present invention, an isotonic vehicle such as Sodium
Chloride Injection, Ringer's Injection, Dextrose Injection, Dextrose and Sodium Chloride
Injection, Lactated Ringer's Injection, or other vehicle as known in the art. The
pharmaceutical composition of the present invention may also contain stabilizers,
preservatives, buffers, antioxidants, or other additives known to those of skill in the art. For
injection, the agents of the invention may be formulated in aqueous solutions, preferably in
physiologically compatible buffers such as Hanks' solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are generally known in 
the art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such
carriers enable the compounds of the invention to be formulated as tablets, pills, dragees,
capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a
patient to be treated. Pharmaceutical preparations for oral use can be obtained by combining
the active compounds with a solid excipient, optionally grinding a resulting mixture, and
processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain
tablets or dragee cores. Suitable excipients are, in particular, fillers such as sugars,
including lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for
example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth,
methyl cellulose, hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, and/or
polyvinylpyrrolidone (PVP). If desired, disintegrating agents may be added, such as the
cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof such as sodium
alginate. Dragee cores are provided with suitable coatings. For this purpose, concentrated
sugar solutions may be used, which may optionally contain gum arabic, talc, polyvinyl
pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions,
and suitable organic solvents or solvent mixtures.
Dyestuffs or pigments may be added to the tablets or dragee coatings for identification or to 
characterize different combinations of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made 
of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol
or sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler
such as lactose, binders such as starches, and/or lubricants such as talc or magnesium
stearate and, optionally, stabilizers. In soft capsules, the active compounds may be dissolved
or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene
glycols. In addition, stabilizers may be added. All formulations for oral administration
should be in dosages suitable for such administration. For buccal administration, the
compositions may take the form of tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide
or other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined
by providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin
for use in an inhaler or insufflator may be formulated containing a powder mix of the
compound and a suitable powder base such as lactose or starch. The compounds may be
formulated for parenteral administration by injection, by bolus injection or continuous
infusion. Formulations for injection may be presented in unit dosage form, e.g., in ampules 
or in multi-dose containers, with an added preservative. The compositions may take such 
forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and may contain 
formulatory agents such as suspending, stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions
of the active compounds in water-soluble form. Additionally, suspensions of the active
compounds may be prepared as appropriate oily injection suspensions. Suitable lipophilic
solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such
as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions may contain
substances which increase the viscosity of the suspension, such as sodium carboxymethyl
cellulose, sorbitol, or dextran. Optionally, the suspension may also contain suitable
stabilizers or agents which increase the solubility of the compounds to allow for the
preparation of highly concentrated solutions. Alternatively, the active ingredient may be in
powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before 
use. 

The compounds may also be formulated in rectal compositions such as suppositories 
or retention enemas, e.g., containing conventional suppository bases such as cocoa butter or
other glycerides. In addition to the formulations described previously, the compounds may
also be formulated as a depot preparation. Such long acting formulations may be
administered by implantation (for example subcutaneously or intramuscularly) or by
intramuscular injection. Thus, for example, the compounds may be formulated with suitable
polymeric or hydrophobic materials (for example as an emulsion in an acceptable oil) or ion
exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a
co-solvent system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic
polymer, and an aqueous phase. The co-solvent system may be the VPD co-solvent system.
VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate
80, and 65% w/v polyethylene glycol 300, made up to volume in absolute ethanol. The VPD
co-solvent system (VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water
solution. This co-solvent system dissolves hydrophobic compounds well, and itself produces
low toxicity upon systemic administration. Naturally, the proportions of a co-solvent system
may be varied considerably without destroying its solubility and toxicity characteristics.
Furthermore, the identity of the co-solvent components may be varied: for example, other
low-toxicity nonpolar surfactants may be used instead of polysorbate 80; the fraction size of
polyethylene glycol may be varied; other biocompatible polymers may replace polyethylene
glycol, e.g., polyvinyl pyrrolidone; and other sugars or polysaccharides may substitute for
dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical compounds
may be employed. Liposomes and emulsions are well known examples of delivery vehicles
or carriers for hydrophobic drugs. Certain organic solvents such as dimethylsulfoxide also
may be employed, although usually at the cost of greater toxicity. Additionally, the
compounds may be delivered using a sustained-release system, such as semipermeable
matrices of solid hydrophobic polymers containing the therapeutic agent. Various types of
sustained-release materials have been established and are well known by those skilled in the
art. Sustained-release capsules may, depending on their chemical nature, release the
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the
biological stability of the therapeutic reagent, additional strategies for protein or other active
ingredient stabilization may be employed. 
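
The composition of the VPD:5W co-solvent system described above follows from simple dilution arithmetic. The Python sketch below works it through under one simplifying assumption that the text does not state: volumes are treated as additive on mixing, which is only approximately true for ethanol and water. The dictionary and function names are illustrative.

```python
# Stock compositions in % w/v, as given in the text.
VPD = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_5_IN_WATER = {"dextrose": 5.0}


def mix(solution_a: dict, solution_b: dict, parts_a: float = 1, parts_b: float = 1) -> dict:
    """Return the % w/v of each solute after mixing two solutions by volume.

    Assumes ideal (additive) volumes, so each solute is simply diluted by
    its solution's share of the total volume.
    """
    total = parts_a + parts_b
    mixed = {name: pct * parts_a / total for name, pct in solution_a.items()}
    for name, pct in solution_b.items():
        mixed[name] = mixed.get(name, 0.0) + pct * parts_b / total
    return mixed


if __name__ == "__main__":
    for solute, pct in mix(VPD, DEXTROSE_5_IN_WATER).items():
        print(f"{solute}: {pct}% w/v")
```

With the 1:1 dilution, each component of VPD is halved and the dextrose concentration falls to half of its 5% starting value; varying `parts_a` and `parts_b` shows how the proportions shift if the dilution is changed.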

The pharmaceutical compositions also may comprise suitable solid or gel phase 
carriers or excipients. Examples of such carriers or excipients include but are not limited to 
calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives,
gelatin, and polymers such as polyethylene glycols. Many of the active ingredients of the
invention may be provided as salts with pharmaceutically compatible counter ions. Such
pharmaceutically acceptable base addition salts are those salts which retain the biological
effectiveness and properties of the free acids and which are obtained by reaction with
inorganic or organic bases such as sodium hydroxide, magnesium hydroxide, ammonia,
trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium acetate,
potassium benzoate, triethanolamine and the like.

The pharmaceutical composition of the invention may be in the form of a complex of 
the protein(s) or other active ingredient(s) of the present invention along with protein or
peptide antigens. The protein and/or peptide antigen will deliver a stimulatory signal to
both B and T lymphocytes. B lymphocytes will respond to antigen through their surface
immunoglobulin receptor. T lymphocytes will respond to antigen through the T cell receptor
(TCR) following presentation of the antigen by MHC proteins. MHC and structurally related
proteins including those encoded by class I and class II MHC genes on host cells will serve
to present the peptide antigen(s) to T lymphocytes. The antigen components could also be
supplied as purified MHC-peptide complexes alone or with co-stimulatory molecules that
can directly signal T cells. Alternatively, antibodies able to bind surface immunoglobulin
and other molecules on B cells as well as antibodies able to bind the TCR and other
molecules on T cells can be combined with the pharmaceutical composition of the invention.
The pharmaceutical composition of the invention may be in the form of a liposome in
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution.
Suitable lipids for liposomal formulation include, without limitation, monoglycerides,
diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like.
Preparation of such liposomal formulations is within the level of skill in the art, as disclosed, 
for example, in U.S. Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of 
which are incorporated herein by reference. 
The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and
severity of the condition being treated, and on the nature of prior treatments which the
patient has undergone. Ultimately, the attending physician will decide the amount of protein
or other active ingredient of the present invention with which to treat each individual patient.
Initially, the attending physician will administer low doses of protein or other active
ingredient of the present invention and observe the patient's response. Larger doses of
protein or other active ingredient of the present invention may be administered until the
optimal therapeutic effect is obtained for the patient, and at that point the dosage is not
increased further. It is contemplated that the various pharmaceutical compositions used to
practice the method of the present invention should contain about 0.01 µg to about 100 mg
(preferably about 0.1 µg to about 10 mg, more preferably about 0.1 µg to about 1 mg) of
protein or other active ingredient of the present invention per kg body weight. For
compositions of the present invention which are useful for bone, cartilage, tendon or
ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the
therapeutic composition for use in this invention is, of course, in a pyrogen-free,
physiologically acceptable form. Further, the composition may desirably be encapsulated or
injected in a viscous form for delivery to the site of bone, cartilage or tissue damage.
Topical administration may be suitable for wound healing and tissue repair. Therapeutically
useful agents other than a protein or other active ingredient of the invention which may also
optionally be included in the composition as described above, may alternatively or
additionally, be administered simultaneously or sequentially with the composition in the
methods of the invention. Preferably for bone and/or cartilage formation, the composition
would include a matrix capable of delivering the protein-containing or other active
ingredient-containing composition to the site of bone and/or cartilage damage, providing a
structure for the developing bone and cartilage and optimally capable of being resorbed into 
the body. Such matrices may be formed of materials presently in use for other implanted 
medical applications. 
The choice of matrix material is based on biocompatibility, biodegradability,
mechanical properties, cosmetic appearance and interface properties. The particular
application of the compositions will define the appropriate formulation. Potential matrices
for the compositions may be biodegradable and chemically defined calcium sulfate,
tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides.
Other potential materials are biodegradable and biologically well-defined, such as bone or
dermal collagen. Further matrices are comprised of pure proteins or extracellular matrix
components. Other potential matrices are nonbiodegradable and chemically defined, such as
sintered hydroxyapatite, bioglass, aluminates, or other ceramics. Matrices may be comprised
of combinations of any of the above-mentioned types of material, such as polylactic acid and
hydroxyapatite or collagen and tricalcium phosphate. The bioceramics may be altered in
composition, such as in calcium-aluminate-phosphate and processing to alter pore size,
particle size, particle shape, and biodegradability. Presently preferred is a 50:50 (mole
weight) copolymer of lactic acid and glycolic acid in the form of porous particles having
diameters ranging from 150 to 800 microns. In some applications, it will be useful to utilize
a sequestering agent, such as carboxymethyl cellulose or autologous blood clot, to prevent 
the protein compositions from disassociating from the matrix. 

A preferred family of sequestering agents is cellulosic materials such as 
alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose,
ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose,
hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred being
cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering agents
include hyaluronic acid, sodium alginate, poly(ethylene glycol), polyoxyethylene oxide,
carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent useful
herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation weight, which
represents the amount necessary to prevent desorption of the protein from the polymer
matrix and to provide appropriate handling of the composition, yet not so much that the
progenitor cells are prevented from infiltrating the matrix, thereby providing the protein the
opportunity to assist the osteogenic activity of the progenitor cells. In further compositions,
proteins or other active ingredients of the invention may be combined with other agents
beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in question.
These agents include various growth factors such as epidermal growth factor (EGF),
platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF). 

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention.
The dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site
of damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue
(e.g., bone), the patient's age, sex, and diet, the severity of any infection, time of
administration and other clinical factors. The dosage may vary with the type of matrix used
in the reconstitution and with inclusion of other proteins in the pharmaceutical composition.
For example, the addition of other known growth factors, such as IGF I (insulin-like growth
factor I), to the final composition, may also affect the dosage. Progress can be monitored by
periodic assessment of tissue/bone growth and/or repair, for example, X-rays,
histomorphometric determinations and tetracycline labeling.

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other
known methods for introduction of nucleic acid into a cell or organism (including, without
limitation, in the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in
the presence of proteins of the present invention in order to proliferate or to produce a 
desired effect on or activity in such cells. Treated cells can then be introduced in vivo for 
therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve
their intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject
being treated. Determination of the effective amount is well within the capability of those
skilled in the art, especially in light of the detailed disclosure provided herein. For any
compound used in the method of the invention, the therapeutically effective dose can be
estimated initially from appropriate in vitro assays. For example, a dose can be formulated in
animal models to achieve a circulating concentration range that includes the IC50 as
determined in cell culture (i.e., the concentration of the test compound which achieves a
half-maximal inhibition of the protein's biological activity). Such information can be used
to more accurately determine useful doses in humans.
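
As an illustration of the IC50 definition above, the following minimal sketch estimates an IC50 from a dose-response series by interpolating, on a logarithmic concentration axis, between the two test concentrations that bracket 50% inhibition. The function name `estimate_ic50` and the data points are hypothetical and are not taken from this disclosure; a full curve fit would normally be preferred in practice.

```python
import math

def estimate_ic50(concentrations, inhibition):
    """Interpolate the concentration giving half-maximal (50%) inhibition.

    concentrations: ascending test concentrations (any consistent unit)
    inhibition: fractional inhibition (0..1) measured at each concentration
    """
    points = list(zip(concentrations, inhibition))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 0.5 <= i_hi and i_hi > i_lo:
            # Linear interpolation on a log10 concentration axis.
            frac = (0.5 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("50% inhibition is not bracketed by the data")

# Hypothetical cell culture data: concentration in nM, fraction inhibited.
conc = [1.0, 10.0, 100.0, 1000.0]
inhib = [0.05, 0.25, 0.75, 0.95]
ic50 = estimate_ic50(conc, inhib)
```

A circulating concentration range for animal studies could then be chosen so that it includes the estimated value.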

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in
cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50%
of the population) and the ED50 (the dose therapeutically effective in 50% of the population).
The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be
expressed as the ratio between LD50 and ED50. Compounds which exhibit high therapeutic
indices are preferred. The data obtained from these cell culture assays and animal studies
can be used in formulating a range of dosage for use in humans. The dosage of such
compounds lies preferably within a range of circulating concentrations that include the ED50
with little or no toxicity. The dosage may vary within this range depending upon the dosage
form employed and the route of administration utilized. The exact formulation, route of
administration and dosage can be chosen by the individual physician in view of the patient's
condition. See, e.g., Fingl et al., 1975, in "The Pharmacological Basis of Therapeutics", Ch.
1 p. 1. Dosage amount and interval may be adjusted individually to provide plasma levels of
the active moiety which are sufficient to maintain the desired effects, or minimal effective
concentration (MEC). The MEC will vary for each compound but can be estimated from in
vitro data. Dosages necessary to achieve the MEC will depend on individual characteristics
and route of administration. However, HPLC assays or bioassays can be used to determine 
plasma concentrations. 
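
The therapeutic index described above is a simple ratio, and a short sketch may make the comparison between compounds concrete. The compound names and the LD50 and ED50 values below are hypothetical and serve only to show the arithmetic.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, the ratio of LD50 to ED50 (same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50

# Hypothetical (LD50, ED50) pairs in mg/kg for two candidate compounds.
candidates = {
    "compound_A": (500.0, 25.0),
    "compound_B": (300.0, 60.0),
}

# Compounds with higher therapeutic indices are preferred, so rank descending.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
```

Under these assumed values, the first compound has the larger index and would be ranked ahead of the second.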

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of
the time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be 
related to plasma concentration. 
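
The relationship between dosing interval and time above the MEC can be sketched numerically. The example below assumes, purely for illustration, a one-compartment model with instantaneous absorption and first-order elimination; this model, the function names, and all parameter values are assumptions and are not part of this disclosure.

```python
import math

def fraction_of_time_above_mec(dose_conc, half_life_h, interval_h, mec,
                               n_doses=10, step_h=0.01):
    """Fraction of the final dosing interval with plasma level above the MEC.

    Assumes each dose instantly adds `dose_conc` to the plasma concentration,
    which then decays with first-order kinetics (illustrative model only).
    """
    k = math.log(2) / half_life_h        # first-order elimination rate constant
    start = (n_doses - 1) * interval_h   # time at which the last dose is given
    n_steps = int(round(interval_h / step_h))
    above = 0
    for s in range(n_steps):
        t = start + s * step_h
        # Superpose the decaying contribution of every dose given so far.
        conc = sum(dose_conc * math.exp(-k * (t - d * interval_h))
                   for d in range(n_doses))
        if conc > mec:
            above += 1
    return above / n_steps

def within_window(fraction, low=0.10, high=0.90):
    """Check a fraction against the broadest 10-90% window stated in the text."""
    return low <= fraction <= high

# A single dose with a 4 h half-life falls to one quarter of its peak after 8 h.
single_dose = fraction_of_time_above_mec(10.0, 4.0, 16.0, 2.5, n_doses=1)
# Dosing every 8 h instead keeps the level above the MEC throughout the interval.
frequent = fraction_of_time_above_mec(10.0, 4.0, 8.0, 2.5, n_doses=10)
```

Shortening the interval raises the fraction of time above the MEC, so a candidate interval can be screened against whichever window is chosen.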

An exemplary dosage regimen for polypeptides or other compositions of the
invention will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with
the preferred dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying
in adults and children. Dosing may be once daily, or equivalent doses may be delivered at
longer or shorter intervals.
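
Because the lower bound of the range is given in micrograms per kilogram and the upper bound in milligrams per kilogram, a small unit-conversion sketch may be helpful. It takes the preferred range as about 0.1 µg/kg to 25 mg/kg daily; the function name and the 70 kg body weight are hypothetical.

```python
def daily_dose_range_mg(weight_kg, low_ug_per_kg=0.1, high_mg_per_kg=25.0):
    """Return the (low, high) total daily dose in mg for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = weight_kg * low_ug_per_kg / 1000.0   # convert micrograms to milligrams
    high_mg = weight_kg * high_mg_per_kg
    return low_mg, high_mg

# Hypothetical 70 kg adult.
low, high = daily_dose_range_mg(70.0)
```

The result is only the arithmetic span of the stated range; the actual dose remains a matter for the prescribing physician.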

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING

The compositions may, if desired, be presented in a pack or dispenser device which 
may contain one or more unit dosage forms containing the active ingredient. The pack may, 
for example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser 
device may be accompanied by instructions for administration. Compositions comprising a 
compound of the invention formulated in a compatible pharmaceutical carrier may also be
prepared, placed in an appropriate container, and labeled for treatment of an indicated 
condition. 

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of
the invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that
contain an antigen-binding site that specifically binds (immunoreacts with) an antigen. Such
antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain,
Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule
obtained from humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ
from one another by the nature of the heavy chain present in the molecule. Certain classes
have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in humans, the light
chain may be a kappa chain or a lambda chain. Reference herein to antibodies includes a
reference to all such classes, subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or 
a portion or fragment thereof, and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for
polyclonal and monoclonal antibody preparation. The full-length protein can be used or,
alternatively, the invention provides antigenic peptide fragments of the antigen for use as
immunogens. An antigenic peptide fragment comprises at least 6 amino acid residues of the
amino acid sequence of the full length protein, such as an amino acid sequence shown in
SEQ ID NO: 1042-2082, or 2535-2986, or Tables 3, 5, 6, or 8, and encompasses an epitope
thereof such that an antibody raised against the peptide forms a specific immune complex
with the full length protein or with any fragment that contains the epitope. Preferably, the
antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino acid
residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located on
its surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a surface region of the protein, e.g., a hydrophilic region. A 
hydrophobicity analysis of the human related protein sequence will indicate which regions of
a related protein are particularly hydrophilic and, therefore, are likely to encode surface
residues useful for targeting antibody production. As a means for targeting antibody
production, hydropathy plots showing regions of hydrophilicity and hydrophobicity may be
generated by any method well known in the art, including, for example, the Kyte-Doolittle or
the Hopp-Woods methods, either with or without Fourier transformation. See, e.g., Hopp and
Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. Mol.
Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety. 
Antibodies that are specific for one or more domains within an antigenic protein, or 
derivatives, fragments, analogs or homologs thereof, are also provided herein. 
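
A hydropathy profile of the kind referred to above can be sketched as a sliding-window average over the Kyte-Doolittle scale, without Fourier transformation. On this scale, lower (more negative) values indicate more hydrophilic residues. The window length of 7, the function names, and the test sequence are illustrative assumptions and are not sequences of the invention.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=7):
    """Average hydropathy over each window of consecutive residues."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(KYTE_DOOLITTLE[aa] for aa in sequence[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]

def most_hydrophilic_window(sequence, window=7):
    """Return (start index, segment, score) of the lowest-scoring window."""
    sequence = sequence.upper()
    scores = hydropathy_profile(sequence, window)
    start = min(range(len(scores)), key=scores.__getitem__)
    return start, sequence[start:start + window], scores[start]

# Hypothetical sequence: a lysine-rich stretch flanked by isoleucine runs.
start, segment, score = most_hydrophilic_window("IIIIIIIKKKKKKKIIIIIII")
```

Windows with the lowest scores are candidate surface regions from which an antigenic peptide of the desired length could be selected.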

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components. 

The term "specific for" indicates that the variable regions of the antibodies of the 
invention recognize and bind polypeptides of the invention exclusively (i.e., able to
distinguish the polypeptide of the invention from other similar polypeptides despite sequence
identity, homology, or similarity found in the family of polypeptides), but may also interact
with other proteins (for example, S. aureus protein A or other antibodies in ELISA
techniques) through interactions with sequences outside the variable region of the antibodies,
and in particular, in the constant region of the molecule. Screening assays to determine
binding specificity of an antibody of the invention are well known and routinely practiced in
the art. For a comprehensive discussion of such assays, see Harlow et al. (Eds), Antibodies:
A Laboratory Manual; Cold Spring Harbor Laboratory; Cold Spring Harbor, NY (1988),
Chapter 6. Antibodies that recognize and bind fragments of the polypeptides of the
invention are also contemplated, provided that the antibodies are first and foremost specific
for, as defined above, full-length polypeptides of the invention. As with antibodies that are
specific for full length polypeptides of the invention, antibodies of the invention that
recognize fragments are those which can distinguish polypeptides from the same family of
polypeptides despite inherent sequence identity, homology, or similarity found in the family
of proteins.

Antibodies of the invention are useful, for example, for therapeutic purposes (by
modulating activity of a polypeptide of the invention), diagnostic purposes to detect or
quantitate a polypeptide of the invention, as well as purification of a polypeptide of the
invention. Kits comprising an antibody of the invention for any of the purposes described
herein are also comprehended. In general, a kit of the invention also includes a control
antigen for which the antibody is immunospecific. The invention further provides a
hybridoma that produces an antibody according to the invention. Antibodies of the
invention are useful for detection and/or purification of the polypeptides of the invention.
Monoclonal antibodies binding to the protein of the invention may be useful
diagnostic agents for the immunodetection of the protein. Neutralizing monoclonal
antibodies binding to the protein may also be useful therapeutics for both conditions
associated with the protein and also in the treatment of some forms of cancer where
abnormal expression of the protein is involved. In the case of cancerous cells or leukemic
cells, neutralizing monoclonal antibodies against the protein may be useful in detecting and
preventing the metastatic spread of the cancerous cells, which may be mediated by the
protein. 

The labeled antibodies of the present invention can be used for in vitro, in vivo, and
in situ assays to identify cells or tissues in which a fragment of the polypeptide of interest is
expressed. The antibodies may also be used directly in therapies or other diagnostics. The
present invention further provides the above-described antibodies immobilized on a solid
support. Examples of such solid supports include plastics such as polycarbonate, complex
carbohydrates such as agarose and Sepharose®, and acrylic resins such as polyacrylamide
and latex beads. Techniques for coupling antibodies to such solid supports are well known
in the art (Weir, D.M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell
Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W.D. et al., Meth.
Enzym. 34 Academic Press, N.Y. (1974)). The immobilized antibodies of the present
invention can be used for in vitro, in vivo, and in situ assays as well as for immuno-affinity 
purification of the proteins of the present invention. 

Various procedures known within the art may be used for the production of 
polyclonal or monoclonal antibodies directed against a protein of the invention, or against 
derivatives, fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies:
A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, 
Cold Spring Harbor, NY, incorporated herein by reference). Some of these antibodies are 
discussed below. 






4.13.1 POLYCLONAL ANTIBODIES 

For the production of polyclonal antibodies, various suitable host animals (e.g., 
rabbit, goat, mouse or other mammal) may be immunized by one or more injections with the
native protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated
to a second protein known to be immunogenic in the mammal being immunized. Examples 
of such immunogenic proteins include but are not limited to keyhole limpet hemocyanin, 
serum albumin, bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can 
further include an adjuvant. Various adjuvants used to increase the immunological response 
include, but are not limited to, Freund's (complete and incomplete), mineral gels (e.g., 
aluminum hydroxide), surface-active substances (e.g., lysolecithin, pluronic polyols, 
polyanions, peptides, oil emulsions, dinitrophenol, etc.), adjuvants usable in humans such as 
Bacille Calmette-Guerin and Corynebacterium parvum, or similar immunostimulatory 
agents. Additional examples of adjuvants that can be employed include MPL-TDM adjuvant 
(monophosphoryl Lipid A, synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known 
techniques, such as affinity chromatography using protein A or protein G, which provide 
primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific
antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be
immobilized on a column to purify the immune specific antibody by immunoaffinity
chromatography. Purification of immunoglobulins is discussed, for example, by D.
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No. 8
(April 17, 2000), pp. 25-28).

4.13.2 MONOCLONAL ANTIBODIES 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as 
used herein, refers to a population of antibody molecules that contain only one molecular
species of antibody molecule consisting of a unique light chain gene product and a unique
heavy chain gene product. In particular, the complementarity determining regions (CDRs) 
of the monoclonal antibody are identical in all the molecules of the population. MAbs thus 
contain an antigen-binding site capable of immunoreacting with a particular epitope of the 
antigen characterized by a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256, 495 (1975). In a hybridoma method, a
mouse, hamster, or other appropriate host animal, is typically immunized with an
immunizing agent to elicit lymphocytes that produce or are capable of producing antibodies
that will specifically bind to the immunizing agent. Alternatively, the lymphocytes can be
immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof 
or a fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells 
of human origin are desired, or spleen cells or lymph node cells are used if non-human 
mammalian sources are desired. The lymphocytes are then fused with an immortalized cell
line using a suitable fusing agent, such as polyethylene glycol, to form a hybridoma cell
(Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, (1986) pp. 59-103).
Immortalized cell lines are usually transformed mammalian cells, particularly
myeloma cells of rodent, bovine and human origin. Usually, rat or mouse myeloma cell
lines are employed. The hybridoma cells can be cultured in a suitable culture medium that
preferably contains one or more substances that inhibit the growth or survival of the unfused,
immortalized cells. For example, if the parental cells lack the enzyme hypoxanthine guanine
phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the hybridomas
typically will include hypoxanthine, aminopterin, and thymidine ("HAT medium"), which
substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high 
level expression of antibody by the selected antibody-producing cells, and are sensitive to a
medium such as HAT medium. More preferred immortalized cell lines are murine myeloma
lines, which can be obtained, for instance, from the Salk Institute Cell Distribution Center,
San Diego, California and the American Type Culture Collection, Manassas, Virginia.
Human myeloma and mouse-human heteromyeloma cell lines also have been described for
the production of human monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984);
Brodeur et al., Monoclonal Antibody Production Techniques and Applications, Marcel
Dekker, Inc., New York, (1987) pp. 51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed 
for the presence of monoclonal antibodies directed against the antigen. Preferably, the 
binding specificity of monoclonal antibodies produced by the hybridoma cells is determined
by immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in
the art. The binding affinity of the monoclonal antibody can, for example, be determined by
the Scatchard analysis of Munson and Pollard, Anal. Biochem., 107, 220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target
antigen are isolated.
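
To illustrate how a binding affinity can be read from a Scatchard plot, the sketch below fits a straight line to bound/free versus bound for a single class of binding sites, where the slope is the negative of the association constant Ka and the x-intercept is the maximum binding Bmax. This is a plain linear regression and is not the computerized procedure of Munson and Pollard; the function name and the synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Least-squares line through (bound, bound/free); returns (Ka, Bmax).

    For one class of sites: bound/free = Ka * (Bmax - bound).
    """
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope               # association constant (1 / Kd)
    bmax = intercept / ka     # x-intercept of the fitted line
    return ka, bmax

# Synthetic, noise-free data generated with Kd = 2 nM and Bmax = 10 (arbitrary units).
kd_true, bmax_true = 2.0, 10.0
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [bmax_true * f / (kd_true + f) for f in free]
ka, bmax = scatchard_fit(bound, free)
```

With real assay data the points scatter around the line, and a curved plot would suggest more than one class of binding sites, which this sketch does not handle.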

After the desired hybridoma cells are identified, the clones can be subcloned by 
limiting dilution procedures and grown by standard methods. Suitable culture media for this 
purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 
medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 

The monoclonal antibodies secreted by the subclones can be isolated or purified from
the culture medium or ascites fluid by conventional immunoglobulin purification procedures
such as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel
electrophoresis, dialysis, or affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of
the invention can be readily isolated and sequenced using conventional procedures (e.g., by
using oligonucleotide probes that are capable of binding specifically to genes encoding the
heavy and light chains of murine antibodies). The hybridoma cells of the invention serve as
a preferred source of such DNA. Once isolated, the DNA can be placed into expression
vectors, which are then transfected into host cells such as simian COS cells, Chinese hamster
ovary (CHO) cells, or myeloma cells that do not otherwise produce immunoglobulin protein,
to obtain the synthesis of monoclonal antibodies in the recombinant host cells. The DNA
also can be modified, for example, by substituting the coding sequence for human heavy and
light chain constant domains in place of the homologous murine sequences (U.S. Patent No.
4,816,567; Morrison, Nature 368, 812-13 (1994)) or by covalently joining to the
immunoglobulin coding sequence all or part of the coding sequence for a
non-immunoglobulin polypeptide. Such a non-immunoglobulin polypeptide can be substituted
for the constant domains of an antibody of the invention, or can be substituted for the
variable domains of one antigen-combining site of an antibody of the invention to create a
chimeric bivalent antibody. 

4.13.3 HUMANIZED ANTIBODIES 

The antibodies directed against the protein antigens of the invention can further
comprise humanized antibodies or human antibodies. These antibodies are suitable for
administration to humans without engendering an immune response by the human against
the administered immunoglobulin. Humanized forms of antibodies are chimeric
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) that are principally comprised
of the sequence of a human immunoglobulin, and contain minimal sequence derived from a
non-human immunoglobulin. Humanization can be performed following the method of
Winter and co-workers (Jones et al., Nature, 321, 522-525 (1986); Riechmann et al., Nature,
332, 323-327 (1988); Verhoeyen et al., Science, 239, 1534-1536 (1988)), by substituting
rodent CDRs or CDR sequences for the corresponding sequences of a human antibody. (See
also U.S. Patent No. 5,225,539.) In some instances, Fv framework residues of the human
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies
can also comprise residues that are found neither in the recipient antibody nor in the
imported CDR or framework sequences. In general, the humanized antibody will comprise
substantially all of at least one, and typically two, variable domains, in which all or
substantially all of the CDR regions correspond to those of a non-human immunoglobulin
and all or substantially all of the framework regions are those of a human immunoglobulin
consensus sequence. The humanized antibody optimally also will comprise at least a portion
of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin
(Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 2, 593-596 
(1992)). 

4.13.4 HUMAN ANTIBODIES

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from 
human genes. Such antibodies are termed "human antibodies", or "fully human antibodies" 
herein. Human monoclonal antibodies can be prepared by the trioma technique; the human
B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV
hybridoma technique to produce human monoclonal antibodies (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human
monoclonal antibodies may be utilized in the practice of the present invention and may be
produced by using human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80,
2026-2030) or by transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et
al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227, 381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in
humans in all respects, including gene rearrangement, assembly, and antibody repertoire.
This approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806;
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783
(1992)); Lonberg et al. (Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13
(1994)); Fishwild et al. (Nature Biotechnology 14, 845-51 (1996)); Neuberger (Nature
Biotechnology 14, 826 (1996)); and Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93
(1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
that are modified so as to produce fully human antibodies rather than the animal's
endogenous antibodies in response to challenge by an antigen. (See PCT publication
WO94/02602). The endogenous genes encoding the heavy and light immunoglobulin chains
in the nonhuman host have been incapacitated, and active loci encoding human heavy and
light chain immunoglobulins are inserted into the host's genome. The human genes are
incorporated, for example, using yeast artificial chromosomes containing the requisite
human DNA segments. An animal which provides all the desired modifications is then
obtained as progeny by crossbreeding intermediate transgenic animals containing fewer than
the full complement of the modifications. The preferred embodiment of such a nonhuman
animal is a mouse, and is termed the Xenomouse™ as disclosed in PCT publications WO
96/33735 and WO 96/34096. This animal produces B cells that secrete fully human
immunoglobulins. The antibodies can be obtained directly from the animal after
immunization with an immunogen of interest, as, for example, a preparation of a polyclonal
antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, 
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. 
Patent No. 5,939,598. It can be obtained by a method including deleting the J segment genes 
from at least one endogenous heavy chain locus in an embryonic stem cell to prevent
rearrangement of the locus and to prevent formation of a transcript of a rearranged
immunoglobulin heavy chain locus, the deletion being effected by a targeting vector 
containing a gene encoding a selectable marker; and producing from the embryonic stem cell 
a transgenic mouse whose somatic and germ cells contain the gene encoding the selectable 
marker. 

A method for producing an antibody of interest, such as a human antibody, is
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression vector that
contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in
culture, introducing an expression vector containing a nucleotide sequence encoding a light
chain into another mammalian host cell, and fusing the two cells to form a hybrid cell. The
hybrid cell expresses an antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically 
relevant epitope on an immunogen, and a correlative method for selecting an antibody that
binds immunospecifically to the relevant epitope with high affinity, are disclosed in PCT
publication WO 99/53049. 

4.13.5 FAB FRAGMENTS AND SINGLE CHAIN ANTIBODIES 

According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent
No. 4,946,778). In addition, methods can be adapted for the construction of Fab expression
libraries (see e.g., Huse, et al., 1989 Science 246, 1275-1281) to allow rapid and effective
identification of monoclonal Fab fragments with the desired specificity for a protein or
derivatives, fragments, analogs or homologs thereof. Antibody fragments that contain the
idiotypes to a protein antigen may be produced by techniques known in the art including, but
not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an antibody molecule;
(ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an
Fab fragment generated by the treatment of the antibody molecule with papain and a reducing
agent; and (iv) Fv fragments.

4.13.6 BISPECIFIC ANTIBODIES 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies 
that have binding specificities for at least two different antigens. In the present case, one of
the binding specificities is for an antigenic protein of the invention. The second binding
target is any other antigen, and advantageously is a cell-surface protein or receptor or 
receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305, 537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas)
produce a potential mixture of ten different antibody molecules, of which only one has the
correct bispecific structure. The purification of the correct molecule is usually accomplished
by affinity chromatography steps. Similar procedures are disclosed in WO 93/08829,
published 13 May 1993, and in Traunecker et al., 1991 EMBO J., 10, 3655-3659.
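
The figure of ten molecules quoted above can be checked by simple enumeration. The following is an illustrative sketch only; the chain labels HA, HB, LA and LB are hypothetical names for the two heavy and two light chains, and the model assumes that any heavy chain can pair with any light chain and that any two half-molecules can dimerize.

```python
from itertools import combinations_with_replacement, product

# Two heavy chains and two light chains are co-expressed (hypothetical labels).
heavy_chains = ["HA", "HB"]
light_chains = ["LA", "LB"]

# A half-molecule is one heavy chain paired with one light chain (4 possibilities).
half_molecules = list(product(heavy_chains, light_chains))

# A full antibody is an unordered pair of half-molecules (repeats allowed).
antibodies = list(combinations_with_replacement(half_molecules, 2))

# The desired bispecific pairs each heavy chain with its cognate light chain.
correct = [ab for ab in antibodies if set(ab) == {("HA", "LA"), ("HB", "LB")}]

print(len(antibodies), len(correct))
```

Under these assumptions the enumeration yields ten distinct assemblies, exactly one of which carries both cognate heavy-chain/light-chain pairs.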

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part
of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant
region (CH1) containing the site necessary for light-chain binding present in at least one of
the fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the
immunoglobulin light chain, are inserted into separate expression vectors, and are
co-transfected into a suitable host organism. For further details of generating bispecific
antibodies see, for example, Suresh et al., Methods in Enzymology, 121, 210 (1986).

According to another approach described in WO 96/27011, the interface between a
pair of antibody molecules can be engineered to maximize the percentage of heterodimers
that are recovered from recombinant cell culture. The preferred interface comprises at least
a part of the CH3 region of an antibody constant domain. In this method, one or more small
amino acid side chains from the interface of the first antibody molecule are replaced with
larger side chains (e.g. tyrosine or tryptophan). Compensatory "cavities" of identical or
similar size to the large side chain(s) are created on the interface of the second antibody
molecule by replacing large amino acid side chains with smaller ones (e.g. alanine or
threonine). This provides a mechanism for increasing the yield of the heterodimer over other
unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full-length antibodies or antibody fragments
(e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from
antibody fragments have been described in the literature. For example, bispecific antibodies
can be prepared using chemical linkage. Brennan et al., Science 229, 81 (1985) describe a
procedure wherein intact antibodies are proteolytically cleaved to generate F(ab')2
fragments. These fragments are reduced in the presence of the dithiol complexing agent
sodium arsenite to stabilize vicinal dithiols and prevent intermolecular disulfide formation.
The Fab' fragments generated are then converted to thionitrobenzoate (TNB) derivatives.
One of the Fab'-TNB derivatives is then reconverted to the Fab'-thiol by reduction with
mercaptoethylamine and is mixed with an equimolar amount of the other Fab'-TNB
derivative to form the bispecific antibody. The bispecific antibodies produced can be used
as agents for the selective immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med., 175, 217-225 (1992)
describe the production of a fully humanized bispecific antibody F(ab')2 molecule. Each
Fab' fragment was separately secreted from E. coli and subjected to directed chemical
coupling in vitro to form the bispecific antibody. The bispecific antibody thus formed was
able to bind to cells overexpressing the ErbB2 receptor and normal human T cells, as well as
trigger the lytic activity of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly
from recombinant cell culture have also been described. For example, bispecific antibodies
have been produced using leucine zippers. Kostelny et al., J. Immunol. 148(5), 1547-1553
(1992). The leucine zipper peptides from the Fos and Jun proteins were linked to the Fab'
portions of two different antibodies by gene fusion. The antibody homodimers were reduced
at the hinge region to form monomers and then re-oxidized to form the antibody
heterodimers. This method can also be utilized for the production of antibody homodimers.
The "diabody" technology described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90,
6444-6448 (1993) has provided an alternative mechanism for making bispecific antibody
fragments. The fragments comprise a heavy-chain variable domain (VH) connected to a
light-chain variable domain (VL) by a linker which is too short to allow pairing between the
two domains on the same chain. Accordingly, the VH and VL domains of one fragment are
forced to pair with the complementary VL and VH domains of another fragment, thereby
forming two antigen-binding sites. Another strategy for making bispecific antibody
fragments by the use of single-chain Fv (sFv) dimers has also been reported. See, Gruber et
al., J. Immunol. 152, 5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147, 60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm
of an immunoglobulin molecule can be combined with an arm which binds to a triggering
molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7),
or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16)
so as to focus cellular defense mechanisms to the cell expressing the particular antigen.
Bispecific antibodies can also be used to direct cytotoxic agents to cells which express a
particular antigen. These antibodies possess an antigen-binding arm and an arm which binds
a cytotoxic agent or a radionuclide chelator, such as EOTUBE, DTPA, DOTA, or TETA.
Another bispecific antibody of interest binds the protein antigen described herein and further
binds tissue factor (TF).

4.13.7 HETEROCONJUGATE ANTIBODIES

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such
antibodies have, for example, been proposed to target immune system cells to unwanted cells
(U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO
92/200373; EP 03089). It is contemplated that the antibodies can be prepared in vitro using
known methods in synthetic protein chemistry, including those involving crosslinking
agents. For example, immunotoxins can be constructed using a disulfide exchange reaction
or by forming a thioether bond. Examples of suitable reagents for this purpose include
iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S.
Patent No. 4,676,980.

4.13.8 EFFECTOR FUNCTION ENGINEERING 

It can be desirable to modify the antibody of the invention with respect to effector
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing
interchain disulfide bond formation in this region. The homodimeric antibody thus
generated can have improved internalization capability and/or increased
complement-mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC). See Caron et
al., J. Exp. Med., 176, 1191-1195 (1992) and Shopes, J. Immunol., 148, 2918-2922 (1992).
Homodimeric antibodies with enhanced anti-tumor activity can also be prepared using
heterobifunctional cross-linkers as described in Wolff et al., Cancer Research, 53,
2560-2565 (1993). Alternatively, an antibody can be engineered that has dual Fc regions and can
thereby have enhanced complement lysis and ADCC capabilities. See Stevenson et al.,
Anti-Cancer Drug Design, 3, 219-230 (1989).

4.13.9 IMMUNOCONJUGATES 

The invention also pertains to immunoconjugates comprising an antibody conjugated
to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active
toxin of bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive
isotope (i.e., a radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have
been described above. Enzymatically active toxins and fragments thereof that can be used
include diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A
chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain,
alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins
(PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria
officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin, enomycin, and the
trichothecenes. A variety of radionuclides are available for the production of radioconjugated
antibodies. Examples include 212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate
(SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl
adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes (such as
glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl) hexanediamine),
bis-diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates
(such as toluene 2,6-diisocyanate), and bis-active fluorine compounds (such as
1,5-difluoro-2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as described in
Vitetta et al., Science, 238: 1098 (1987). Carbon-14-labeled
1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX-DTPA) is an exemplary chelating agent for
conjugation of radionuclide to the antibody. See WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention
can be recorded on computer readable media. As used herein, "computer readable media"
refers to any medium which can be read and accessed directly by a computer. Such media
include, but are not limited to: magnetic storage media, such as floppy discs, hard disc
storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical
storage media such as RAM and ROM; and hybrids of these categories such as
magnetic/optical storage media. A skilled artisan can readily appreciate how any of the
presently known computer readable media can be used to create a manufacture
comprising a computer readable medium having recorded thereon a nucleotide sequence of the
present invention. As used herein, "recorded" refers to a process for storing information on
a computer readable medium. A skilled artisan can readily adopt any of the presently known
methods for recording information on a computer readable medium to generate manufactures
comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means
chosen to access the stored information. In addition, a variety of data processor programs
and formats can be used to store the nucleotide sequence information of the present
invention on computer readable medium. The sequence information can be represented in a
word processing text file, formatted in commercially-available software such as WordPerfect
and Microsoft Word, or represented in the form of an ASCII file, stored in a database
application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any
number of data processor structuring formats (e.g. text file or database) in order to obtain
computer readable medium having recorded thereon the nucleotide sequence information of
the present invention.
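
The two storage options named above (an ASCII text file and a database record) can be sketched as follows. This is a minimal illustration under stated assumptions: the identifier EXAMPLE_SEQ_1 and the bases are hypothetical placeholders rather than a disclosed sequence, the FASTA-style layout is one common ASCII convention, and an in-memory SQLite table stands in for the database applications listed in the text.

```python
import sqlite3

# Hypothetical record; the identifier and bases are placeholders, not a disclosed sequence.
seq_id = "EXAMPLE_SEQ_1"
sequence = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

def to_fasta(identifier, bases, width=60):
    """Format a sequence as FASTA-style ASCII text with fixed-width lines."""
    lines = [bases[i:i + width] for i in range(0, len(bases), width)]
    return ">" + identifier + "\n" + "\n".join(lines) + "\n"

# Option 1: plain ASCII representation (could be written to any text file).
fasta_text = to_fasta(seq_id, sequence)

# Option 2: the same record stored in a relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequences (id TEXT PRIMARY KEY, bases TEXT NOT NULL)")
conn.execute("INSERT INTO sequences VALUES (?, ?)", (seq_id, sequence))
stored = conn.execute(
    "SELECT bases FROM sequences WHERE id = ?", (seq_id,)
).fetchone()[0]
conn.close()
```

Which structure is preferable depends, as the text notes, on how the stored information will be accessed: a flat file suits sequential reading by search programs, while a table suits lookup by identifier.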

By providing any of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534, or
a representative fragment thereof, or a nucleotide sequence at least 95% identical to any of
the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534 in computer readable form, a
skilled artisan can routinely access the sequence information for a variety of purposes.
Computer software is publicly available which allows a skilled artisan to access sequence
information provided in a computer readable medium. The examples which follow
demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search
algorithms on a Sybase system is used to identify open reading frames (ORFs) within a
nucleic acid sequence. Such ORFs may be protein-encoding fragments and may be useful in
producing commercially important proteins such as enzymes used in fermentation reactions
and in the production of commercially useful metabolites.
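
To make the notion of ORF identification concrete, the following is a minimal sketch of a forward-strand ORF scan. It is not the BLAST or BLAZE procedure referred to above; it simply reports stretches that begin with ATG and end at the first in-frame stop codon. The function name find_orfs, the min_codons parameter and the example input are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    start and end are 0-based and end-exclusive; the stop codon is included
    and counted toward min_codons.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume scanning after the stop codon
    return orfs

example = "CCATGGCCATTTAACC"  # hypothetical input sequence
print(find_orfs(example))
```

A fuller treatment would also scan the reverse complement and apply a realistic minimum length; both are omitted here to keep the sketch short.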

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the
present invention comprises a central processing unit (CPU), input means, output means, and
data storage means. A skilled artisan can readily appreciate that any one of the currently
available computer-based systems is suitable for use in the present invention. As stated
above, the computer-based systems of the present invention comprise a data storage means
having stored therein a nucleotide sequence of the present invention and the necessary
hardware means and software means for supporting and implementing a search means. As
used herein, "data storage means" refers to memory which can store nucleotide sequence
information of the present invention, or a memory access means which can access
manufactures having recorded thereon the nucleotide sequence information of the present
invention.

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target structural
motif with the sequence information stored within the data storage means. Search means are
used to identify fragments or regions of a known sequence which match a particular target
sequence or target motif. A variety of known algorithms are disclosed publicly and a variety
of commercially available software for conducting search means are available and can be used in the
computer-based systems of the present invention. Examples of such software include, but
are not limited to, Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA
(NPOLYPEPTIDEIA). A skilled artisan can readily recognize that any one of the available
algorithms or implementing software packages for conducting homology searches can be
adapted for use in the present computer-based systems. As used herein, a "target sequence"
can be any nucleic acid or amino acid sequence of six or more nucleotides or two or more
amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the
less likely a target sequence will be present as a random occurrence in the database. The
most preferred sequence length of a target sequence is from about 10 to 300 amino acids,
more preferably from about 30 to 100 nucleotide residues. However, it is well recognized
that searches for commercially important fragments, such as sequence fragments involved in
gene expression and protein processing, may be of shorter length.
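
The relationship between target length and chance occurrence can be illustrated with a rough estimate. The sketch below assumes a database of independent, equally likely residues, which real sequence databases do not satisfy, so the numbers are indicative only; the function name and the database size of one million bases are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance matches of one fixed target in a random database.

    Assumes independent, equally likely residues (a simplification).
    """
    # Number of positions at which a target of this length could start.
    positions = max(database_length - target_length + 1, 0)
    # Probability that a given position matches the target exactly.
    return positions * (1.0 / alphabet_size) ** target_length

for n in (6, 10, 20, 30):
    print(n, expected_random_hits(n, 1_000_000))
```

Under this model each additional nucleotide reduces the expected number of chance matches roughly fourfold, which is why longer targets are less likely to occur at random.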

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on
a three-dimensional configuration which is formed upon the folding of the target motif.
There are a variety of target motifs known in the art. Protein target motifs include, but are
not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include,
but are not limited to, promoter sequences, hairpin structures and inducible expression
elements (protein binding sequences).

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used
to control gene expression through triple helix formation or antisense DNA or RNA, both of
which methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and
are designed to be complementary to a region of the gene involved in transcription (triple
helix - see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241, 456
(1988); and Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense -
Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of
Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally
results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization
blocks translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the sequences of
the present invention is necessary for the design of an antisense or triple helix
oligonucleotide.
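
As an illustration of how sequence information feeds antisense design, the following sketch derives a DNA oligonucleotide complementary to a chosen region of an mRNA. It covers base complementarity only; it does not model triple helix rules, secondary structure or specificity. The function name antisense_oligo and the mRNA string are hypothetical, and the 20 to 40 base check simply mirrors the preferred range stated above.

```python
# Watson-Crick complements, giving DNA bases for either RNA or DNA input.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}

def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo complementary to mrna[start:start+length], written 5' to 3'."""
    if not 20 <= length <= 40:
        raise ValueError("length outside the 20 to 40 base range preferred in the text")
    region = mrna.upper()[start:start + length]
    if len(region) < length:
        raise ValueError("target region runs past the end of the sequence")
    # Reverse the region and complement each base to obtain the antisense strand.
    return "".join(COMPLEMENT[base] for base in reversed(region))

mrna = "AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG"  # hypothetical mRNA
oligo = antisense_oligo(mrna, 0, 20)
print(oligo)
```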

4.16 DIAGNOSTIC ASSAYS AND KITS

The present invention further provides methods to identify the presence or expression 
of one of the ORFs of the present invention, or homolog thereof, in a test sample, using a 
nucleic acid probe or antibodies of the present invention, optionally conjugated or otherwise 
associated with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polynucleotide for a period sufficient to form the complex, and detecting the complex, so
that if a complex is detected, a polynucleotide of the invention is detected in the sample.
Such methods can also comprise contacting a sample under stringent hybridization
conditions with nucleic acid primers that anneal to a polynucleotide of the invention under
such conditions, and amplifying annealed polynucleotides, so that if a polynucleotide is
amplified, a polynucleotide of the invention is detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polypeptide for a period sufficient to form the complex, and detecting the complex, so that if
a complex is detected, a polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying
for binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay.
One skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic
acid probes or antibodies of the present invention. Examples of such assays can be found in
Chard, T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science
Publishers, Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3
(1985); Tijssen, P., Practice and Theory of Immunoassays: Laboratory Techniques in
Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The
Netherlands (1985). The test samples of the present invention include cells, protein or
membrane extracts of cells, or biological fluids such as sputum, blood, serum, plasma, or
urine. The test sample used in the above-described method will vary based on the assay
format, nature of the detection method and the tissues, cells or extracts used as the sample to
be assayed. Methods for preparing protein extracts or membrane extracts of cells are well
known in the art and can readily be adapted in order to obtain a sample which is
compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the
invention provides a compartment kit to receive, in close confinement, one or more
containers which comprise: (a) a first container comprising one of the probes or antibodies
of the present invention; and (b) one or more other containers comprising one or more of the
following: wash reagents and reagents capable of detecting the presence of a bound probe or
antibody.

In detail, a compartment kit includes any kit in which reagents are contained in
separate containers. Such containers include small glass containers, plastic containers or
strips of plastic or paper. Such containers allow one to efficiently transfer reagents from
one compartment to another compartment such that the samples and reagents are not
cross-contaminated, and the agents or solutions of each container can be added in a
quantitative fashion from one compartment to another. Such containers will include a
container which will accept the test sample, a container which contains the antibodies used
in the assay, containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound
antibody or probe. Types of detection reagents include labeled nucleic acid probes, labeled
secondary antibodies, or in the alternative, if the primary antibody is labeled, the enzymatic
or antibody binding reagents which are capable of reacting with the labeled antibody. One
skilled in the art will readily recognize that the disclosed probes and antibodies of the present
invention can be readily incorporated into one of the established kit formats which are well
known in the art.

4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or
infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve
chemical attachment of a labeling or imaging agent, administration of the labeled
polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging the labeled
polypeptide in vivo at the target site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present
invention further provides methods of obtaining and identifying agents which bind to a
polypeptide encoded by an ORF corresponding to any of the nucleotide sequences set forth
in SEQ ID NO: 1-1041, or 2083-2534, or bind to a specific domain of the polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the
present invention, or nucleic acid of the invention; and
(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide
of the invention for a time sufficient to form a polynucleotide/compound complex, and
detecting the complex, so that if a polynucleotide/compound complex is detected, a
compound that binds to a polynucleotide of the invention is identified.

Likewise, in general, therefore, such methods for identifying compounds that bind to
a polypeptide of the invention can comprise contacting a compound with a polypeptide of
the invention for a time sufficient to form a polypeptide/compound complex, and detecting
the complex, so that if a polypeptide/compound complex is detected, a compound that binds
to a polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can
also comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression
of a reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound
that binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via
such methods can include compounds which modulate the expression of a polynucleotide of
the invention (that is, increase or decrease expression relative to expression levels observed
in the absence of the compound). Compounds, such as compounds identified via the
methods of the invention, can be tested using standard assays well known to those of skill in
the art for their ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be 
selected and screened at random or rationally selected or designed using protein modeling 
techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents
and the like are selected at random and are assayed for their ability to bind to the protein
encoded by the ORF of the present invention. Alternatively, agents may be rationally
selected or designed. As used herein, an agent is said to be "rationally selected or designed"
when the agent is chosen based on the configuration of the particular protein. For example,
one skilled in the art can readily adapt currently available procedures to generate peptides,
pharmaceutical agents and the like, capable of binding to a specific peptide sequence, in
order to generate rationally designed antipeptide peptides, for example see Hurby et al.,
"Application of Synthetic Peptides: Antisense Peptides," In Synthetic Peptides, A User's
Guide, W.H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry
28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or
EMFs of the present invention. As described above, such agents can be randomly screened
or rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single
ORF or multiple ORFs which rely on the same EMF for expression control. One class of
DNA binding agents are agents which contain base residues which hybridize or form a triple
helix by binding to DNA or RNA. Such agents can be based on the classic
phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric
derivatives which have base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix -
see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241, 456 (1988); and
Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense - Okano, J.
Neurochem. 56, 560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene
Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally results in
a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks
translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the sequences of
the present invention is necessary for the design of an antisense or triple helix
oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention
can be used as a diagnostic agent. Agents which bind to a protein encoded by one of the
ORFs of the present invention can be formulated using known techniques to generate a
pharmaceutical composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic
acid hybridization probes capable of hybridizing with naturally occurring nucleotide
sequences. The hybridization probes of the subject invention may be derived from any of
the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534. Because the corresponding
gene is only expressed in a limited number of tissues, a hybridization probe derived from
any of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534 can be used as an
indicator of the presence of RNA of cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used
in PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both.
The probe will comprise a discrete nucleotide sequence for the detection of identical
sequences or a degenerate pool of possible sequences for identification of closely related
genomic sequences.
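
The idea of a degenerate pool can be made concrete by expanding a probe written with ambiguity codes into every sequence it represents. The sketch below uses the standard IUPAC nucleotide codes; the function name expand_degenerate and the probe ATGRAYN are hypothetical and are not taken from the disclosed sequences.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(probe):
    """List every concrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]

pool = expand_degenerate("ATGRAYN")  # hypothetical degenerate probe
print(len(pool))
```

The pool size is the product of the number of alternatives at each position, so it grows quickly with the number of ambiguous positions.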

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such
vectors are known in the art and are commercially available and may be used to synthesize
RNA probes in vitro by means of the addition of the appropriate RNA polymerase such as T7 or
SP6 RNA polymerase and the appropriate radioactively labeled nucleotides. The nucleotide
sequences may be used to construct hybridization probes for mapping their respective
genomic sequences. The nucleotide sequence provided herein may be mapped to a
chromosome or specific regions of a chromosome using well-known genetic and/or
chromosomal mapping techniques. These techniques include in situ hybridization, linkage
analysis against known chromosomal markers, hybridization screening with libraries or
flow-sorted chromosomal preparations specific to known chromosomes, and the like. The
technique of fluorescent in situ hybridization of chromosome spreads has been described,
among other places, in Verma et al. (1988) Human Chromosomes: A Manual of Basic
Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data.
Examples of genetic map data can be found in the 1994 Genome Issue of Science
(265:1981f). Correlation between the location of a nucleic acid on a physical chromosomal
map and a specific disease (or predisposition to a specific disease) may help delimit the
region of DNA associated with that genetic disease. The nucleotide sequences of the subject
invention may be used to detect differences in gene sequences between normal, carrier or
affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly 
practiced using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those
of skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy
is to precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can
be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6), 1469-72);
using UV light (Nagata et al, 1985; Dahlen et al, 1987; Morrissey & Collins, (1989) Mol.
Cell Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al, 1988;
1989); all references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al (1994) Proc. Natl. Acad. Sci. USA 91(8),
3072-6, describe the use of biotinylated probes, although these are duplex probes, that are
immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads may be
purchased from Dynal, Oslo. Of course, this same linking chemistry is applicable to coating
any surface with streptavidin. Biotinylated probes may be purchased from various sources,
such as, e.g., Operon Technologies (Alameda, CA).

Nunc Laboratories (Naperville, IL) also sells suitable material that could be used.
Nunc Laboratories have developed a method by which DNA can be covalently bound to the
microwell surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with
secondary amino groups (>NH) that serve as bridgeheads for further covalent coupling.
CovaLink Modules may be purchased from Nunc Laboratories. DNA molecules may be bound
to CovaLink exclusively at the 5'-end by a phosphoramidate bond, allowing immobilization of
more than 1 pmol of DNA (Rasmussen et al, (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end
has been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is
employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as
immobilization using only a single covalent bond is preferred. The phosphoramidate bond joins
the DNA to the CovaLink NH secondary amino groups that are positioned at the end of spacer
arms covalently grafted onto the polystyrene surface through a 2 nm long spacer arm. To link
an oligonucleotide to CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus
must have a 5'-end phosphate group. It is, perhaps, even possible for biotin to be covalently
bound to CovaLink and then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M
1-methylimidazole, pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7.
The ssDNA solution is then dispensed into CovaLink NH strips (75 µl/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC),
dissolved in 10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are incubated
for 5 hours at 50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash;
first the wells are washed 3 times, then they are soaked with washing solution for 5 min., and
finally they are washed 3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS
heated to 50°C).

It is contemplated that a further suitable method for use with the present invention is
that described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated
herein by reference. This method of preparing an oligonucleotide bound to a support involves
attaching a nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link
to aliphatic hydroxyl groups carried by the support. The oligonucleotide is then synthesized on
the supported nucleoside and protecting groups removed from the synthetic oligonucleotide
chain under standard conditions that do not cleave the oligonucleotide from the support.
Suitable reagents include nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For
example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described
by Fodor et al. (1991) Science 251(4995), 767-73, incorporated herein by reference. Probes
may also be immobilized on nylon supports as described by Van Ness et al (1991) Nucleic
Acids Res., 19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988)
Anal. Biochem. 169(1), 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al, (1994) Proc. Natl. Acad. Sci. USA 91(11),
5022-6, incorporated herein by reference. These authors used current photolithographic
techniques to generate arrays of immobilized oligonucleotide probes (DNA chips). These
methods, in which light is used to direct the synthesis of oligonucleotide probes in high-density,
miniaturized arrays, utilize photolabile 5'-protected N-acyl-deoxynucleoside phosphoramidites,
surface linker chemistry and versatile combinatorial synthesis strategies. A matrix of 256
spatially defined oligonucleotide probes may be generated in this manner.
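
To make the combinatorial arithmetic concrete, the sketch below enumerates one set of 256 probes and assigns each to a cell of a 16 x 16 grid. It assumes, purely for illustration, that the 256 probes are all possible four-base sequences (4^4 = 256). The text above does not state how the 256 probes are composed, and the function name `tetramer_grid` is hypothetical.

```python
from itertools import product

BASES = "ACGT"


def tetramer_grid():
    """Lay out all 4**4 = 256 four-base probes on a 16 x 16 grid.

    Assumption for illustration: the probe set is every tetranucleotide;
    the disclosure itself does not specify the probe composition.
    """
    probes = ["".join(p) for p in product(BASES, repeat=4)]
    return [probes[row * 16:(row + 1) * 16] for row in range(16)]


if __name__ == "__main__":
    grid = tetramer_grid()
    print(len(grid), "rows x", len(grid[0]), "columns")
    print("first cell:", grid[0][0], "last cell:", grid[15][15])
```

Under this assumption only 16 coupling steps (four bases at each of four positions) would be needed to cover all 256 cells, which is the appeal of a combinatorial masking strategy.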

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs,
genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC
inserts, and RNA, including mRNA without any amplification steps. For example, Sambrook
et al (1989) describe three protocols for the isolation of high molecular weight DNA from
mammalian cells (p. 9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods.
Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA
samples may be prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of
skill in the art including, for example, using restriction enzymes as described at 9.24-9.28 of
Sambrook et al (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al (1990)
Nucleic Acids Res. 18(24), 7455-6, incorporated herein by reference. In this method, DNA
samples are passed through a small French pressure cell at a variety of low to intermediate
pressures. A lever device allows controlled application of low to intermediate pressures to the
cell. The results of these studies indicate that low-pressure shearing is a useful alternative to
sonic and enzymatic DNA fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the
two base recognition endonuclease, CviJI, described by Fitzgerald et al (1992) Nucleic Acids
Res. 20(14) 3753-62. These authors described an approach for the rapid fragmentation and
fractionation of DNA into particular sizes that they contemplated to be suitable for shotgun
cloning and sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the
specificity of this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from
the small molecule pUC19 (2688 base pairs). Fitzgerald et al (1992) quantitatively evaluated
the randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z
minus M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts
PyGCPy and PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated
at a rate consistent with random fragmentation.
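
The difference between the normal and relaxed (CviJI**) specificities described above can be sketched as a simple site search. This is a minimal illustration that assumes a linear, single-stranded representation of the sequence and complete digestion; the function names and test sequences are hypothetical.

```python
import re

# Normal specificity: PuGCPy (Pu = A or G, Py = C or T).
NORMAL = re.compile(r"(?=[AG]GC[CT])")
# Relaxed specificity: PuGCPy plus PyGCPy and PuGCPu, i.e. every NGCN
# site except PyGCPu. Lookahead is used so overlapping sites are found.
RELAXED = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")


def cut_positions(seq, relaxed=False):
    """Return 0-based indices of the base following each blunt cut.

    The cut falls between the G and the C of the site, two bases after
    the start of the four-base match.
    """
    pattern = RELAXED if relaxed else NORMAL
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split a linear sequence at every predicted cut position."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    print(fragments("TTAGCTTTCGCTTT"))                # normal: one cut
    print(fragments("TTAGCTTTCGCTTT", relaxed=True))  # relaxed: two cuts
```

Because three of the four Pu/Py combinations around the central GC are cleaved under the relaxed conditions, predicted sites occur far more often, which is consistent with the quasi-random fragment distribution reported above.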

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of
2-5 µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or
agarose gel electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared,
it is important to denature the DNA to give single stranded pieces available for hybridization.
This is achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is
then cooled quickly to 2°C to prevent renaturation of the DNA fragments before they are
contacted with the chip. Phosphate groups must also be removed from genomic DNA by
methods known in the art.

4.22 PREPARATION OF DNA ARRAYS

Arrays may be prepared by spotting DNA samples on a support such as a nylon
membrane. Spotting may be performed by using arrays of metal pins (the positions of which
correspond to an array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a
DNA solution to a nylon membrane. By offset printing, a density of dots higher than the density
of the wells is achieved. One to 25 dots may be accommodated in 1 mm², depending on the
type of label used. By avoiding spotting in some preselected number of rows and columns,
separate subsets (subarrays) may be formed. Samples in one subarray may be the same genomic
segment of DNA (or the same gene) from different individuals, or may be different, overlapped
genomic clones. Each of the subarrays may represent replica spotting of the same samples. In
one example, a selected gene segment may be amplified from 64 patients. For each patient, the
amplified gene segment may be in one 96-well plate (all 96 wells containing the same sample).
A plate for each of the 64 patients is prepared. By using a 96-pin device, all samples may be
spotted on one 8 x 12 cm membrane. Subarrays may contain 64 samples, one from each patient.
Where the 96 subarrays are identical, the dot span may be 1 mm² and there may be a 1 mm
space between subarrays.
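
The 64-patient example can be sketched as a coordinate calculation. The sketch assumes a pin pitch of 9 mm (the usual 96-well spacing), an 8 x 8 arrangement of offsets within each subarray, and a 1 mm dot pitch, so that 8 mm of dots plus a 1 mm gap fills each 9 mm cell. These layout values and the function name `dot_position` are illustrative assumptions rather than parameters stated in the text.

```python
PIN_PITCH_MM = 9.0   # assumed spacing between neighbouring pins
DOT_PITCH_MM = 1.0   # assumed spacing between dots inside a subarray
GRID = 8             # assumed 8 x 8 = 64 offset positions per subarray
PIN_COLUMNS = 12     # a 96-pin device arranged as 8 rows x 12 columns


def dot_position(patient, pin):
    """Return (x_mm, y_mm) of the dot printed by `pin` for `patient`.

    Each pin prints one dot per patient plate; the plate index selects
    the offset inside that pin's subarray (offset printing).
    """
    if not 0 <= patient < GRID * GRID:
        raise ValueError("patient index must be in 0..63")
    if not 0 <= pin < 96:
        raise ValueError("pin index must be in 0..95")
    pin_row, pin_col = divmod(pin, PIN_COLUMNS)
    off_row, off_col = divmod(patient, GRID)
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return x, y


if __name__ == "__main__":
    print(dot_position(0, 0))    # first patient, first pin
    print(dot_position(63, 95))  # last patient, last pin
```

With these assumed values the 96 x 64 = 6144 dots span roughly 107 mm by 71 mm, which fits on the 8 x 12 cm membrane mentioned above.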

Another approach is to use membranes or plates (available from NUNC, Naperville,
Illinois) which may be partitioned by physical spacers, e.g. a plastic grid molded over the
membrane, the grid being similar to the sort of membrane applied to the bottom of multiwell
plates, or hydrophobic strips. A fixed physical spacer is not preferred for imaging by exposure
to flat phosphor-storage screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of
the present disclosure, one of skill in the art will appreciate that many other embodiments and
variations may be made in the scope of the present invention. Accordingly, it is intended that
the broader aspects of the present invention not be limited to the disclosure of the following
examples. The present invention is not to be limited in scope by the exemplified embodiments
which are intended as illustrations of single aspects of the invention, and compositions and
methods which are functionally equivalent are within the scope of the invention. Indeed,
numerous modifications and variations in the practice of the invention are expected to occur to
those skilled in the art upon consideration of the present preferred embodiments. Consequently,
the only limitations which should be placed upon the scope of the invention are those which
appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated
by reference in their entirety.

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from
various human tissues and in some cases isolated from a genomic library derived from human
chromosome using standard PCR, SBH sequence signature analysis and Sanger sequencing
techniques. The inserts of the library were amplified with PCR using primers specific for the
vector sequences which flank the inserts. Clones from cDNA libraries were spotted on nylon
membrane filters and screened with oligonucleotide probes (e.g., 7-mers) to obtain signature
sequences. The clones were clustered into groups of similar or identical sequences.
Representative clones were selected for sequencing.

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied
Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences.

5.2 EXAMPLE 2

Assemblage of Novel Contigs 

The contigs of the present invention, designated as SEQ ID NO: 2083-2534, were
assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend the
seed EST into an extended assemblage, by pulling additional sequences from different
databases (i.e., Hyseq's database containing EST sequences, dbEST, gb pri, and UniGene, and
exons from public domain genomic sequences predicted by GenScan) that belong to this
assemblage. The algorithm terminated when there were no additional sequences from the
above databases that would extend the assemblage. Further, inclusion of component sequences
into the assemblage was based on a BLASTN hit to the extending assemblage with BLAST
score greater than 300 and percent identity greater than 95%.
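
The seed-and-extend procedure above can be outlined as follows. This is a minimal sketch and not the actual assembly software: the BLASTN search is abstracted as a caller-supplied function `search`, the `Hit` record and all names are hypothetical, and the loop is written iteratively rather than recursively. Only the two inclusion thresholds (score greater than 300, identity greater than 95%) come from the text.

```python
from typing import Callable, Iterable, NamedTuple, Set


class Hit(NamedTuple):
    """One database hit against the current assemblage (hypothetical record)."""
    seq_id: str
    score: float
    percent_identity: float


def passes_threshold(hit, min_score=300.0, min_identity=95.0):
    """Inclusion rule from the text: score > 300 and identity > 95%."""
    return hit.score > min_score and hit.percent_identity > min_identity


def extend_assemblage(seed_id: str,
                      search: Callable[[Set[str]], Iterable[Hit]]) -> Set[str]:
    """Grow an assemblage from a seed until no new sequence qualifies.

    `search` stands in for a BLASTN query of the current members against
    the sequence databases; it is supplied by the caller.
    """
    members = {seed_id}
    while True:
        new = {h.seq_id for h in search(members) if passes_threshold(h)} - members
        if not new:
            return members  # nothing extends the assemblage: terminate
        members |= new


if __name__ == "__main__":
    # Toy stand-in for the databases: each member "hits" a fixed list.
    toy = {
        "seed": [Hit("a", 450, 98.0), Hit("b", 250, 99.0)],
        "a": [Hit("c", 310, 96.5), Hit("d", 500, 90.0)],
    }
    result = extend_assemblage(
        "seed", lambda ms: [h for m in ms for h in toy.get(m, [])])
    print(sorted(result))
```

In the toy data, "b" is rejected on score and "d" on identity, so the assemblage stops growing after "a" and "c" are added.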

Table 8 sets forth the novel predicted polypeptides (including proteins) encoded by the
novel polynucleotides (SEQ ID NO: 2083-2534) of the present invention, and their
corresponding translation start and stop nucleotide locations to each of SEQ ID NO: 2083-2534.
Table 8 also indicates the method by which the polypeptide was predicted. Method A refers to
a polypeptide obtained by using a software program called FASTY (available from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of the
translated novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in
Enzymology, 183:63-98 (1990), herein incorporated by reference). Method B refers to a
polypeptide obtained by using a software program called GenScan for human/vertebrate
sequences (available from Stanford University, Office of Technology Licensing) that predicts
the polypeptide based on a probabilistic model of gene structure/compositional properties (C.
Burge and S. Karlin, J. Mol. Biol., 268:78-94 (1997), incorporated herein by reference).
Method C refers to a polypeptide obtained by using a Hyseq proprietary software program that
translates the novel polynucleotide and its complementary strand into six possible amino acid
sequences (forward and reverse frames) and chooses the polypeptide with the longest open
reading frame.
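
The general idea behind Method C, six-frame translation followed by selection of the longest open reading frame, can be illustrated as below. This is not the Hyseq proprietary program. It is a sketch that assumes the standard genetic code and assumes that an open reading frame runs from a methionine codon to the next stop codon (or to the end of the sequence); the text does not define either point.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T sequence."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate codon by codon; unknown codons become 'X', stops '*'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def longest_orf(dna):
    """Return the longest Met-initiated peptide over all six frames.

    Assumption for illustration: an ORF starts at the first M of a
    stop-free stretch and ends at the next stop or the sequence end.
    """
    dna = dna.upper()
    best = ""
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best):
                    best = segment[start:]
    return best


if __name__ == "__main__":
    print(longest_orf("CCATGGCTTAAGG"))
```

Ties between frames are resolved in favour of the first frame examined, which is an arbitrary choice of this sketch.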
5.3 EXAMPLE 3
Novel Nucleic Acids

The novel nucleic acids of the present invention SEQ ID NO: 1-1041 were assembled
from Hyseq's proprietary EST sequences as described in Example 1 and human genome
sequences that are available from the public databases (http://www.ncbi.nlm.nih.gov/).
Exons were predicted from human genome sequences using GenScan
(http://genes.mit.edu/GENSCANinfo.html); HMMgene
(http://www.cbs.dtu.dk/services/^^ 1 .html); and GeneMark.hmm
(http://genemark.biology.gatech.edu/GeneMark/whmm_info.html). The Hyseq proprietary
EST sequences and the predicted exons were assembled based on a BLASTN hit to the
extending assemblage with BLAST score greater than 300 and percent identity greater than
95%. Then, the predicted genes were analyzed using Neural Network SignalP V1.1 program
(from Center for Biological Sequence Analysis, The Technical University of Denmark) for
presence of a signal peptide. These sequences were further analyzed for absence of a
transmembrane region using the TMpred program
(http://www.ch.embnet.org/software/TMPRED_form.html).

Table 1 shows the various tissue sources of SEQ ID NO: 1-1041.

The homologs for polypeptides SEQ ID NO: 1042-2082, that correspond to
nucleotide sequences SEQ ID NO: 1-1041, were obtained by BLASTP version 2.0al 19MP-WashU
searches against Genpept release 124 using the BLAST algorithm. The results showing
homologues for SEQ ID NO: 1042-2082 from Genpept 124 are shown in Table 2.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J.
Comp. Biol., Vol. 6, 219-235 (1999), http://motif.stanford.edu/ematrix-search/ herein
incorporated by reference), all the polypeptide sequences were examined to determine
whether they had identifiable signature regions. Scoring matrices of the eMatrix software
package are derived from the BLOCKS, PRINTS, PFAM, PRODOM, and DOMO
databases. Table 3 shows the accession number of the homologous eMatrix signature found
in the indicated polypeptide sequence, its description, and the results obtained which include
accession number subtype; raw score; p-value; and the position of signature in amino acid
sequence.

Using the Pfam software program (Sonnhammer et al, Nucleic Acids Res., Vol.
26(1) pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences
were examined for domains with homology to certain peptide domains. Table 4 shows the
name of the Pfam model found, the description, the e-value and the Pfam score for the
identified model within the sequence. Further description of the Pfam models can be found
at http://pfam.wustl.edu/ .

The GeneAtlas™ software package (Molecular Simulations Inc. (MSI), San Diego,
CA) was used to predict the three-dimensional structure models for the polypeptides
encoded by SEQ ID NO: 1-1041 (i.e. SEQ ID NO: 1042-2082). Models were generated by
(1) PSI-BLAST, which is a multiple alignment sequence profile-based search developed
by Altschul et al, (Nucl. Acids. Res. 25, 3389-3408 (1997)), (2) High Throughput Modeling
(HTM) (Molecular Simulations Inc. (MSI), San Diego, CA), which is an automated sequence
and structure searching procedure (http://www.msi.com/), and (3) SeqFold™, which is a fold
recognition method described by Fischer and Eisenberg (J. Mol. Biol. 209, 779-791 (1998)).
This analysis was carried out, in part, by comparing the polypeptides of the invention with
the known NMR (nuclear magnetic resonance) and x-ray crystal three-dimensional structures
as templates. Table 5 shows: "PDB ID", the Protein Data Bank (PDB) identifier given to the
template structure; "Chain ID", identifier of the subcomponent of the PDB template
structure; "Compound Information", information of the PDB template structure and/or its
subcomponents; "PDB Function Annotation", the function of the PDB template as
annotated by the PDB files (http://www.rcsb.org/PDB/); start and end amino acid position of
the protein sequence aligned; PSI-BLAST score, the verify score, the SeqFold score, and the
Potential(s) of Mean Force (PMF). The verify score produced by GeneAtlas™ software
(MSI) is based on the Profile-3D threading program developed in Dr. David
Eisenberg's laboratory (US patent no. 5,436,850 and Luthy, Bowie, and Eisenberg, Nature,
356:83-85 (1992)) and a publication by R. Sanchez and A. Sali, Proc. Natl. Acad. Sci. USA,
95:13597-13602. GeneAtlas normalizes the verify score for
proteins with different lengths so that a unified cutoff can be used to select good models as
follows:

Verify score (normalized) = (raw score - 1/2 high score)/(1/2 high score)
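
The normalization can be written as a short function. This is a minimal sketch of the formula as stated. The names `raw_score` and `high_score` are placeholders, and what "high score" means for a given protein length is defined by the GeneAtlas software, not here.

```python
def normalized_verify_score(raw_score, high_score):
    """Apply (raw - 1/2 high) / (1/2 high) as given in the text.

    `high_score` is taken as an input; how it is derived for a protein
    of a given length is not specified in this sketch.
    """
    if high_score == 0:
        raise ValueError("high_score must be non-zero")
    half = high_score / 2.0
    return (raw_score - half) / half


if __name__ == "__main__":
    # A raw score equal to the high score normalizes to 1.0, and a raw
    # score of half the high score normalizes to 0.0.
    print(normalized_verify_score(100.0, 100.0))
    print(normalized_verify_score(50.0, 100.0))
```

Under this formula a raw score below half of the high score gives a negative value, which falls outside the 0 to 1.0 range treated as a good model.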

The PMF score, produced by GeneAtlas™ software (MSI), is a composite scoring
function that depends in part on the compactness of the model, sequence identity in the
alignment used to build the model, pairwise and surface mean force potentials (MFP). As
given in Table 5, a verify score between 0 and 1.0, with 1 being the best, represents a good
model. Similarly, a PMF score between 0 and 1.0, with 1 being the best, represents a good
model. A SeqFold™ score of more than 50 is considered significant. A good model may
also be determined by one of skill in the art based on all the information in Table 5 taken in
totality.

Table 6 shows the position of the signal peptide in each of the polypeptides and the
maximum score and mean score associated with that signal peptide using Neural Network
SignalP V1.1 program (from Center for Biological Sequence Analysis, The Technical
University of Denmark). The process for identifying prokaryotic and eukaryotic signal
peptides and their cleavage sites is also disclosed by Henrik Nielsen, Jacob Engelbrecht,
Soren Brunak, and Gunnar von Heijne in the publication "Identification of prokaryotic and
eukaryotic signal peptides and prediction of their cleavage sites" Protein Engineering, Vol.
10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum S score and a mean
S score, as described in the Nielsen et al reference, were obtained for the polypeptide
sequences.

Table 7 correlates each of SEQ ID NO: 1-1041 to a specific chromosomal location.

Table 9 is a correlation table of the novel polynucleotide sequences SEQ ID NO: 1-1041,
their corresponding polypeptide sequences SEQ ID NO: 1042-2082, their
corresponding priority contig nucleotide sequences SEQ ID NO: 2083-2534, their
corresponding priority contig polypeptide sequences SEQ ID NO: 2535-2986, and the US
serial number of the priority application in which the contig sequence was filed.

Table 10 is a correlation table of the novel polynucleotide sequences SEQ ID NO: 1-1041,
the novel polypeptide sequences SEQ ID NO: 1042-2082, and the corresponding SEQ
ID NO in which the sequence was filed in priority US application 60/311,261.

Table 1

Tissue Origin | RNA/Tissue Source | Library Name | SEQ ID NO:
adrenal gland | Clontech | ADR002 | 13 23 34 45 77 111 115 122 187 194 210-211 249-250 255 290 320 357-358 362 420 443 451 492 499 551 577 630 698 702 713 718 805 808 819 841-843 845 861 896 899 909 924 937 949 985 1037
adult bladder | Invitrogen | BLD001 | 9 87 189 320-321 358 563 768 840 970
adult brain | Clontech | ABR001 | 184-186 277 282 352 558 849 871 898 958
adult brain | Clontech | ABR006 | 30 45 170 199 210 226 260 292-294 340 357 413 443-444 478 499 551-552 579 582 584-588 632-637 646 654-655 676 683 731-732 755-756 777 813-827 861 872 874 880 883 1002 1012
adult brain | Clontech | ABR008 | 15 45 54 61 67 81 87 101 106 108 122-123 143-144 170 181-183 195-209 215 222 245-248 261-270 283-289 292-293 296 306 308-310 327 340 358 370 394-407 409 421 428 440 442 459 477-478 496 531-547 551-552 556 565-566 578-579 606 618 620-621 629-630 651 653-655 664 667-668 707 713-714 729 745 750 753 756 772 779 788 790 793-794 799-800 802 808 812 823 826-827 849-850 859 862 872 883 885 898 917 919 921 930 935-936 947 974 985-986 992 1002 1006 1012 1028 1030 1036 1039
adult brain | Clontech | ABR011 | 1012
adult brain | GIBCO | AB3001 | 23 57-58 67 85 296 492 499 579 853 898-899 950 1012
adult brain | GIBCO | ABD003 | 45 59-62 67 72 82 85-88 156 179-180 182 296 299 355-356 440 458 474 483 499 563 823 840 852 860 885 898 992 999 1012
adult brain | Invitrogen | ABR014 | 45 115 238 470 599 653 974-976
adult brain | Invitrogen | ABR015 | 45 600 885 1012
adult brain | Invitrogen | ABR016 | 599 1012
adult brain | Invitrogen | ABT004 | 34 45 54 74 84 118 138-143 170-171 180-181 208 255 277 359 379 428 438 499 501 536 715 731 783 793 799 805 809 824 862 898 912 977 998 1012
adult cervix | BioChain | CVX001 | 23 26 48 54 57 67 77 118 121 177 183 238 255 271-272 296 303 311-319 325 352 361-362 411-412 419-420 424 428 440 447 478 541 567 569 599-600 622 699 793 805 813 831 836-837 839 844-845 848 863 872 913 928-929 944 958 965 970 973 1001 1004
adult colon | Invitrogen | CLN001 | 250 322-325 429 630 788 970 985
adult heart | GIBCO | AHR001 | 28-30 45 61 67 90-94 118 122 150-151 183 193 250-251 279 349-351 369-370 410 419 474 483 485 490 493 552 563 719 773 835-836 853 861 961 976 1030
adult kidney | GIBCO | AKD001 | 24 31-34 44-46 48 55 62 67 81 121 144 151 162 176-178 183 251 255 258 277 352 358 369-370 386 408 420 429 483 490 536 546 579 599-600 602 645 698 793 805 874 898 913
adult kidney | Invitrogen | AKT002 | 32 53-54 67 85 177 251 260 341 386 408 419-420 431-436 478 490 493 507 561 582 596-599 698 728 788 805 819 837 844-848 885 898 969 989 1013
adult liver | Clontech | ALV003 | 101 121 193 579 638-639 729 890-893 919 1007 1017
adult liver | Invitrogen | ALV002 | 75 157 173 183 212-214 236 240 263 292 323 335 386 408 415 495-499 552 577 589 599 727 782 858 869 898-900 924 968
adult lung | GIBCO | ALG001 | 67 77 152 369 386 419 443 483 583 732 849 907
adult ovary | Invitrogen | AOV001 | 5 26 34 43 45 48 55 61-62 64-67 77 87 101-102 105 115 118 122-129 143 151 155-163 170 174-175 177 181-183 193 251-252 286 292 338 347 353-354 369 381 410 415 420 424 451 458 483 489 497 499 515 536 541 546 552 577 579 595 599-600 604 647 658 661 665 699 744 782-783 800 805-806 814 831 835 839-840 844 853 874 895 898-899 913 924 929 941-942 949 973 977 994 1004 1007 1012 1016 1031 1037
adult placenta | Clontech | APL001 | 67 419 688 728 848 930
adult spleen | Clontech | SPLc01 | 82 101 187 255 260 358 370 447 483 489 579 586 648 768 835 845 848 853-857 863 885 913 917 962 986
adult spleen | GIBCO | ASP001 | 87 105 108 122 158 172 215 299 380 492 499 552 599 622 785 830 840 850 889
adult testis | GIBCO | ATS001 | 68-69 106 183 251 301 360 386 520 541 570 753 788 832 840 890 916
bone marrow | Clontech | BMD001 | 10-12 16-19 24-26 35 46 48 58 77 85 95-96 98-99 122 156 164 172 187 222 251 385 424 429 458 478 483 489 519 568-569 599 622-623 630-631 696 700 758 765 794 844 914 919 924 944 971 985 992 1001 1017
bone marrow | GF | BMD002 | 23 45 81-82 104-105 115 136 144 156 170 172-173 181 183 247 287 292 306 319-320 327 362 370 418 478-483 489 492 536 548-552 565 569-570 572 579 596 599 614-622 630 640-641 643 653 668 691 699 708 715-718 726 743 756 758 772 789 841 889 917 920 947 958 994 1006 1010 1037 1039
cultured preadipocytes | Stratagene | ADP001 | 121 255 400 490-494 511 629 689 758 793 835 861 913 944 949 984
endothelial cells | Stratagene | EDT001 | 34 45 54 58 67 120-122 144 151-154 183 193 299 385 440 451 458 483 490 499 515 552 563 569 577 579 599 622-623 752 793 800 844-845 898-899 942 944 949
fetal brain | Clontech | FBR001 | 139 168 356 599 702 712 831 845 850 872-873 898 921 1037
fetal brain | Clontech | FBR004 | 138 168 250 363 873-875 882
fetal brain | Clontech | FBR006 | 14 29 45 51 81 87 101 104 118 131 143-144 157 171 177 206 208-209 215 229 238 251 261 273 279 283 291-293 326-332 358 362 370-371 397 400 402 413 419 428 461 472 485 551-560 568-569 579 618 620 629-630 653-657 659-661 663-673 675 700 714 739-742 744-746 766 779 793 809 815 819 822 840 850 859 862 872 875-885 930 958 972 995 1002 1006 1028 1030-1031 1038
fetal brain | GIBCO | HFB001 | 13-15 54-57 62 67 70-72 84 121 174 177 180 183 410 417 424 485 518 520 542 552 578-579 599 785 793 805 831-832 840 858 871 883 898-899 977 1012
fetal brain | Invitrogen | FBT002 | 7 45 49 144-149 157 180 255 263 356 493 501 600 630 707 748 832 845 858 913 1012
fetal heart | Invitrogen | FHR001 | 24 45 81-82 104 114-115 118 121 144 152 181 239 24/ zoo 292 327 362 370 381 419 428 444 453 458 478 486 493 503 569 571 576 582 596 618 640 668 674-688 719-722 731 744 753 762 772 784 794 819 823 836 850 885 914 944 949 957-958 1017
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'Tissue Origin 


RN A/Tissue Source 


Library Name 


SEO ID NO: 


fetal kidney 


Ciontech 


FKD001 


82 107 208 458 483 485 536 758 
760 819 836 894 1017 


fetal kidney 


Clontech 


FKD002 


61 101 105 183 189 238 247 263 
292 327 340 370 405 416 419 
517 569 586 620 648 668 689- 
691 731 746-752 763 771-772 
787-788 819 840 842 854 861 
872 944 958 961 969 


fetal kidney 


Invitrogen 


FKD007 


116 


fetal liver 


Clontech 


FLV002 


410 429 454 692-695 704 781 
805 894-895 1017 


fetal liver 


Clontech 


FLV004 


67 107 115 118151 187 241 255 
287 370 466 478 492 518 548 i 
552 569 582 589 630 653 668 
696-699 752-757 784 789 805 
885 908 985 


fetal liver 


Invitrogen 


FLV001 


45 101 130-137 157 222 240 337 
386 428-429 492 552 589 693 
727 840 


fetal liver-spleen 


Columbia 
University 


FLS001 


1-9 18 20-23 27 34 36-38 45 55 
67 70 83 89 94 118 122158 164 
172-173 177 183 219 238 240 
246 251 292 299 323 335 338 
358 369 376 385-386 397 408 
416 419 421-422 429 451 456- 
460 466 472 478 483 489-490 
493 516 536 543 546 551 569- 
573 579 586 588-589 593-595 
599-603 619 622 668 676 691 
699 702 724 731 734 743 787 
789 794 800 805 834-835 840 
848 853 874 880 885 890-891 
899 908 910 923 926-927 930 
939-940 944 949 958 973 980 
992 999 1004 1007 1009 1013 


fetal liver-spleen 


Columbia 
University 


FLS002 


3 8 17 22 36-37 46 55 61 63 70 
72 85 89-90 94 106 122 148 156 
158 165 172 177 181 194 213 
215 219 246 251292 299 304- 
307 323-324 338 346 355 366 
371 374 380-381 386 392 397 
410 417 421 440 455 462-464 
466-468 489-490 492-493 507- 
521 536 552 565-566 569 571- 
576 592 596 599 619 630 650 
655 661 688 698-699 712 718 
723-729 731 735-737 753 767 
783 824 831 834 840 845 871 
885 891 894 899 902 906-909 

913 yzj-yiu v4U y*o y**y yoo 
973 980 992 999 1003 1007 1017 
1032 1040-1041 


fetal liver-spleen 


Columbia 
University 


FLS003 


23 67 106 150 158 193 338 374 
376411 443 478493 546 565 
569-570 582 589 609-613 630 
661 699 724 727-734 767 809 
812 834-835 845 880 890 910 
929-930 958 973 980 985 1013 


fetal lung 


Clontech 


FLG001 


728 824 1008 


fetal lung 


Clontech 


FLG004 


115 668 


fetal lung 


Invitrogen 


FLG003 


120 183 322 333-336476 516 
691 831 835 850 1012 


fetal muscle 


Invitrogen 


FMS001 


45 338-339 365 369 386 429 431 
496-497 789 793 856 970 1008 
1019 1033 1035 


fetal muscle 


Invitrogen 


FMS002 


45 115 171 247 327 365 370 405 
536 642-652 668 710-711 719 
726 758-761 765 836 899 901 
907 913 948 965 1037 


fetal skin 


Invitrogen 


FSK001 


29 57 67 74 81 118 152 177 180 
193 294 340-342 345 375 397 
419 437-443 445-451 454 475 
532 541 546 565 598 604 630 
650 668 728 742 772 789 793 
804-805 823 828-830 837 840 
849 899 901 922 958 970 1007 
1022 1033 


fetal skin 


Invitrogen 


FSK002 


34 45 77 81 85 115 173 200 279 
292-293 360 370 381 419 428- 
429 451 466 490 551 569-570 
579 600 604 630 647 668 698 
700-706 729 731 746 750 758 
762-766 768-773 780 794 840 
850 859 861 885 901 911 913 
957 961 965 973 1038 


fibroblast 


Stratagene 


LFB001 


55 72 143 255 490 502-505 587 
599 627 861 863 885 984 1037 


induced neuron-cells 


Stratagene 


NTD001 


30 82 111 124 181 206 356 392 
410 417 484-488 578 831-834 
898 977 1036 1039 




Columbia 
University 


IB2002 


18 21 45 66 73-75 100-103 118 
152 168-171 177 180 241-242 
252 292-295 340 345 366-367 
413 438 454 499 501 542 561- 
562 578-580 599 668 702 728- 
729 745 765 768 772 793 796- 
799 823-824 863 874 887 899 
948-949 967 975 977 981 983 
992 995 1012 


infant brain 


Columbia 
University 


IB2003 


81 101 113 118 177 180 241 252 
293 340 345 367 371 379 381 
400 417 499-501 536 562 578 
580-581 629-630 702 713 745 
796-805 824 831 837 840 845 
874 885 967 977 981 985 1012 
1030 


infant brain 


Columbia 
University 


IBM002 


168 358 413-414 913 


infant brain 


Columbia 
University 


IBS001 


415 417 533 581 886-888 977 


leukocyte 


Clontech 


LUC003 


77 619 889 949 


leukocyte 


GIBCO 


LUC001 


34 36 38-42 50-52 55 67 77 81- 
83 85 121 137 144 158 172 183 
223 226 251 254 258 291 324 
368-374 378 424 429 443 483 
492 536 552 564 600 602 732 
760 768 782 785 805 838 844- 
845 848 850 889 898 905 908 
946 973 992 


lung 


55 72 143 255 490 
502-505 587 599 
627 861 863 885 
984 1037 






lung tumor 


Invitrogen 


LGT002 


55 61 65 77-79 82 102105 115 
156-157 165-167 170 182-183 
197 243-244 251 253 296-297 
325 370 386 418-419 421-425 
478 483 492 499 520 531 533 
541 569 577 582 600 788 844- 
845 848 874 899 911 913 916- 
918 939 944 949 956 970 976 


lymph node 


Clontech 


ALN001 


47 63 104-105 183 483 492 691 
894 1017 


lymphocytes 


ATCC 


LPC001 


45 53 77 158 193 251 392 421 
455 469-474 483 507 536 546 
579 581 618 621 640 765 780- 
787 793 838 845 875 924 968 
978 999 


macrophage 


Invitrogen 


HMP001 


122 147 157 183 251 255 493 
738 898-899 903-905 


mammary gland 


Invitrogen 


MMG001 


45 64 67 83-84 101 113 143 148 
152 158 164 177 181-183 189 
216-218 253 255 258 263 274 
299 336 419 421 423 426-430 
440 466 478 490 520 533 536 
564 569 579 582 630 646 753 
768 782 789 800 835 840 848 
850 883 912-913 944 950 958 


melanoma from-cell-line- 
ATCC-#CRL-1424 


Clontech 


MEL004 


62 158 181 298 362 364 402 419 
515 536 896-897 958 973 1004 
1008 


*Mixture of 16 tissues - 
mRNA 


Various Vendors 


CGd010 


353 358 823 942 982 1020 


*Mixture of 16 tissues - 
mRNA 


Various Vendors 


CGd011 


569 630 944 955 999 


*Mixture of 16 tissues - 
mRNA 


Various Vendors 


CGd012 


9 38 59 63 80 85 122-123 152 
154 177 195 217 232 246 250 
296 300 306 323-324 381 427 
434 438-439 478 489 499 507 
517 538 558 565 571 575 630 
657 681701 736 762 792 800 
oa? fm.ROd 861 871-872 899 
929 941 955 968 974 985-1003 
1006 1011-1012 1033 


*Mixture of 16 tissues - 
mRNA 


Various Vendors 


CGd013 


232 434 748 956-958 992 


*Mixture of 16 tissues - 
mRNA 


Various Vendors 


CGd015 


18 69 115 324335 548 551 569 
582 600 622 731 819 899 911 
944 957-958 1012 1017-1018 


*Mixture of 16 tissues - 
mRNA 


Various Vendors 


CGd016 


46 172 183 323 371 481 493 565 
569 571596 599 630 654 698 
745 762 786 849 907 944 1004- 
1013 1037 1039 


neuronal cells 


Stratagene 


NTU001 


7 33 45 107113 121 150 183 286 
385 440 478 483 485 487 489 
536 569 582 756 768 772 819 
836 944 958 966 1001 


pituitary gland 


Clontech 


PIT004 


158 222 255 345 356 370 379 
569 579 819 831 861-862 885 
898 922 1017 


placenta 


Clontech 


PLA003 


7 36 61 279 419 478 489 582 586 
599 641 647 668 681 707-711 
774-779 1001 


placenta 


Invitrogen 


APL002 


57 173 536 728 793 800 


prostate 


Clontech 


PRT001 


26 219-222 229 412 599 665 762 
835 837 860 878 951 1031 


rectum 


Invitrogen 


REC001 


9 292 343-346 431 546 714 800 
863 918 


retinoic acid-induced- 
neuronal-cells 


Stratagene 


NTR001 


112 400478 569 582 629 756 
758 800 819 831 835-836 850 
906 944 958 


salivary gland 


Clontech 


SAL001 


58 6177 118 150 158 294 347- 
348 483 492-493 546 752 830 
915 


skeletal muscle 


Clontech 


SKM001 


80 118 247 365 483 719 805 812 
823 


small intestine 


Clontech 


SIN001 


34 37 45 52 60 93 106 119 121 
138 144 177 180 208 223-225 
238 247 294 323 335-336 343 
362 370 380 386 397 409-411 
416 420 440 451 455 478 489 
493 536 571 577 579 590 602 
604-608 614 622 624-628 655 
668 688 700 714 805-812 831 
841 872 894 899 914 924 926 
929 958 961 965 973 991 998 
1017 


spinal cord 


Clontech 


SPC001 


51 164 182-183 190 226-228 
255-257 275-277 286 296 299 
451 454 542 552 579 591 728 
753 770 786 790 831 835 849- 
852 898 907 958 1000 1012 


stomach 


Clontech 


STO001 


72 222 232 247 258 366 645 


thalamus 


Clontech 


THA002 


45 49 113 155 164 180 183 191- 
192 208 229-232 238 345 417 
443 512 551 558 592 630 728 
800 823 840 858-860 885 898 

c\nc into 


thymus 


Clontech 


THM001 


45 141 160 183 258 360 378-379 
418 451460 569 602 619 731 
788-790 819 835 845 958 965 
1004 


thymus 


Clontech 


THMc02 


47 108 115 121 144 157 173 247 
259-260 300 327 340 358 362 
375-393 409 453 455 461 478- 
479 489 551 565 569-570 579 
582 615 630 640 653 668 708 
744 752 758 766 790-795 810 
819 823 835-836 845 850 853 
861 885 911 919 938 958 962 
994 1001 1027 


thyroid gland 


Clontech 


THR001 


46 58 67 80 82 144 160 177 183 
193-194 233-235 251 255 263 
268 278-280 286 299 301-303 
324 358 370 386 397 408 410 
420 440 474 483 493 506 519- 
520 533 594 599-600 602 658 
661 719 758 772 785 788 793 
830 851 853 864-867 898 904 
909 924 929 961 973 991 998 
1001 1009 


trachea 


Clontech 


TRC001 


45 154 236 238 281 323 416 571 
602 868-869 913 


umbilical cord 


BioChain 


FUC001 


34 45 54 58 67 70 85 152 154 
177 180 188 208 251299 370 
409 415 419 434451-455 483 
596 599 647 661 733 742 793 
808 839-840 845 849-850 861 
888 911 913 992 


uterus 


Clontech 


UTR001 


177 237-239 255 258 417 493 
520 567 599 604 646 844 870 
874 898 973 


young liver 


GIBCO 


ALV001 


45 419 440 443 490 653 732 753 
805 845 898 904 



*The 16 tissue/mRNAs and their vendor sources are as follows: 1) Normal adult brain mRNA (Invitrogen), 2) 
Normal adult kidney mRNA (Invitrogen), 3) Normal fetal brain mRNA (Invitrogen), 4) Normal adult liver 
mRNA (Invitrogen), 5) Normal fetal kidney mRNA (Invitrogen), 6) Normal fetal liver mRNA (Invitrogen), 7) 
Normal fetal skin mRNA (Invitrogen), 8) Human adrenal gland mRNA (Clontech), 9) Human bone marrow 
mRNA (Clontech), 10) Human leukemia lymphoblastic mRNA (Clontech), 11) Human thymus mRNA 
(Clontech), 12) Human lymph node mRNA (Clontech), 13) Human spinal cord mRNA (Clontech), 14) 
Human thyroid mRNA (Clontech), 15) Human esophagus mRNA (BioChain), 16) Human conceptional 
umbilical cord mRNA (BioChain). 
Accession 
No. 
Description 
% 
Identity 


1044 


AAB32400 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 30 SEQ ID 
NO:86. 


339 


100 


1044 


AAM74711 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 35017. 


335 


100 


1044 


AAM61909 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 34014. 


335 


100 


1045 


gi3859599 


Arabidopsis 
thaliana 


similar to class I chitinases (Pfam: 
PF00182, E=1.2e-142,N=1) 


74 


27 


1045 


gi15292107 


Drosophila 
melanogaster 


LD38671p 


74 


33 


1045 


gi2258324 


Fusarium 
oxysporum f. sp. 
ciceris 


yellowing-associated protein 


73 


32 


1046 


gi17428204 


Ralstonia 
solanacearum 


CONSERVED HYPOTHETICAL 
PROTEIN 


74 


32 


1046 


gi4314432 


Homo sapiens 


similar to phosphatidylinositol 
(4,5)bisphosphate 5-phosphatase; 
match to PID:gl399105 


71 


30 


1046 


gi|17545909| 
ref|NP_5193 
11.1| 


Ralstonia 
solanacearum 


CONSERVED HYPOTHETICAL 
PROTEIN 


74 


32 


1047 


gi9756017 


Actinoplanes sp. 
50/110 


alpha-amylase 


69 


38 


1047 


gi|6572499|g 
b|AAF17291 
.1| 


Homo sapiens 


LHX3 protein 


67 


26 


1047 


gi|18572988| 
ref|XP_0291 
70.2| 


Homo sapiens 


LIM homeobox protein 3 


67 


26 


1048 


AAY28474 


Homo sapiens 


UYJO Human Capon protein. 


721 


99 


1048 


gi2895555 


Homo sapiens 


carboxyl-terminal PDZ ligand of 
neuronal nitric oxide synthase 


721 


99 


1048 


gi2895557 


Rattus 
norvegicus 


carboxyl-terminal PDZ ligand of 
neuronal nitric oxide synthase 


654 


92 


1049 


gi19713721 


Fusobacterium 
nucleatum subsp. 
nucleatum 
ATCC 25586 


GTP-binding protein era 


66 


28 


1050 


gi31291 


Homo sapiens 


fumarylacetoacetase (AA 1-349) 


175 


70 


1050 


gi182393 


Homo sapiens 


fumarylacetoacetate hydrolase 


175 


70 


1050 


gi12803409 


Homo sapiens 


fumarylacetoacetate 


175 


70 


1052 


gi4680089 


Human 
immunodeficiency 
virus type 1 


envelope glycoprotein 


79 


26 


1052 


gi3868997 


Ephydatia 
fluviatilis 


EFPDE2 


74 


20 


1052 


gi4679590 


Human 
immunodeficiency 
virus type 1 


envelope glycoprotein 


74 


25 


1054 


gi3844648 


Mycoplasma 
genitalium 


glycerol kinase (glpK) 


71 


28 


1054 


gi18448155 


Ipomoea leaf 
curl virus 


AC3 


70 


27 


1054 


gi|12044888| 
ref|NP_0726 
98.1| 


Mycoplasma 
genitalium 


glycerol kinase (glpK) 


71 


28 


1056 


AAM56747 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 28852. 


229 


72 


1056 


AAM67067 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27373. 


224 


69 


1056 


AAM54664 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26769. 


224 


69 


1058 


gi|13310191| 
gb|AAK181 
89.1|AF331 
500_1 


multiple 
sclerosis 
associated 
retrovirus 
element 


recombinant envelope protein 


228 


79 


1058 


gi|21103962| 
gb|AAM331 
41.1| 


Homo sapiens 


enverin-2 


209 


77 


1058 


gi|8272468|g 
b|AAF74215 
.1|AF15696 
3_1 


Homo sapiens 


envelope protein 


198 


75 


1059 


gi20380199 


Homo sapiens 


Similar to LOC168246 


251 


100 


1059 


gi|8388692|e 
mb|CAB940 
42.1| 


Leishmania 
major 


probable DNA-binding protein 


67 


46 


1060 


gi|21292780| 
gb|EAA049 
25.1| 


Anopheles 
gambiae str. 
PEST 


agCP4203 


70 


39 


1061 


gi330862 


Equine 
herpesvirus 1 


membrane glycoprotein 


179 


30 


1061 


gi17221106 


Equine 
herpesvirus 1 


glycoprotein gp2 


178 


34 


1061 


AAE03643 


Homo sapiens 


INCY- Human extracellular matrix and 
cell adhesion molecule-7 (XMAD-7). 


175 


29 


1062 


gi|l 10371171 
gb|AAG274 
S5.1|AF194 
537 1 


Homo sapiens 


NAG13 


334 


66 


1062 


gi|1335205|e 
mb|CAA364 
80.1| 


Homo sapiens 


ORFH 


332 


66 


1063 


gi21323402 


Corynebacterium 
glutamicum 
ATCC 13032 


ABC-type transporter, periplasmic 
component 


70 


36 


1063 


gi|19551869| 
ref|NP_5998 
71.1| 


Corynebacterium 
glutamicum 


COG1464: ABC-type uncharacterized 
transport systems, periplasmic 
component 


70 


36 


1063 


gi|17551878| 
ref|NP_4990 
90.1| 


Caenorhabditis 
elegans 


TPR Domain 


67 


37 


1064 


gi2308977 


Aspergillus 
nidulans 


chitin synthase 


66 


29 


1065 


gi18076958 


Yarrowia 
lipolytica 


Opt1 protein 


74 


30 


1065 


gi786145 


Walleye dermal 
sarcoma virus 


envelope polyprotein 


73 


28 


1065 


gi2801522 


Walleye dermal 
sarcoma virus 


gPrenv 


73 


28 


1066 


gi9294279 


Arabidopsis 
thaliana 


Ta11-like non-LTR retroelement 
protein-like; CHP-rich zinc finger 
protein-like 


67 


32 


1066 


gi|20848817| 
ref|XP_1380 
10.1| 


Mus musculus 


similar to HEAT SHOCK COGNATE 
PROTEIN 80 


83 


69 


1069 


AAM77637 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 37943. 


96 


65 


1069 


AAM64901 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 37006. 


96 


65 


1069 


gi|17473741| 
ref|XP_0623 
80.1| 


Homo sapiens 


similar to Meningioma-expressed 
antigen 6/11 (MEA6) (MEA11) 


112 


56 


1070 


gi296288 


Homo sapiens 


histone H1 


77 


44 


1070 


gi5923857 


Artemisia annua 


squalene synthase 


75 


35 


1070 


AAO08837 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 22729. 


73 


39 


1071 


gi21483554 


Drosophila 
melanogaster 


SD02058p 


72 


29 


1071 


gi8515845 


Homo sapiens 


hepatocellular carcinoma associated 
protein TD26 


71 


38 


1071 


gi|21483554| 
gb|AAM527 
52.1| 


Drosophila 
melanogaster 


SD02058p 


72 


29 


1072 


gi5902896 


Streptomyces 
avermitilis 


type I polyketide synthase AVES 4 


74 


50 


1072 


gi|21301752| 
gb|EAA138 
97.1| 


Anopheles 
gambiae str. 
PEST 


agCP8235 


70 


34 


1073 


AAV30916_ 
aa1 


Homo sapiens 


GEMY Human secreted protein 
AR415 4 cDNA. 


99 


66 


1073 


ABB89113 


Homo sapiens 


HUMA- Human polypeptide SEQ ID 
NO 1489. 


99 


66 


1073 


AAB90679 


Homo sapiens 


GEMY Human AR415 4 protein 
sequence SEQ ID 35. 


99 


66 


1074 


AAG99338 


Homo sapiens 


TAKE Human atypical tachykinin 
protein fragment SEQ ID NO: 20. 


380 


no 

yz 


1074 


AAG99336 


Homo sapiens 


TAKE Human atypical tachykinin 
protein fragment SEQ ID NO: 13. 


329 


91 


1074 


AAG99333 


Homo sapiens 


TAKE Human atypical tachykinin 
protein fragment SEQ ID NO: 3. 


324 


91 


1075 


gi17945760 


Drosophila 
melanogaster 


RE33302p 


305 


29 


1075 


gi1039447 


Saccharomyces 
cerevisiae 


Lpblp 


91 


25 


1075 


AAB64777 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 5 SEQ ID 
NO:63. 


78 


77 


1076 


AAB50261 


Homo sapiens 


CORI- Human breast cancer associated 
B726P-20 protein. 


308 


39 


1076 


AAB50244 


Homo sapiens 


CORI- Human breast cancer associated 
B726P-79 protein. 


308 


39 


1076 


AAB84702 


Homo sapiens 


CORI- Amino acid sequence of a 
human cancer associated antigen. 


308 


39 


1077 


gi2529735 


Gorilla gorilla 


glycophorin B/E precursor 


71 


31 


1077 


AAB74724 


Homo sapiens 


INCY- Human membrane associated 
protein MEMAP-30. 


70 


31 


1077 


gi4164424 


Schizosaccharomyces 
pombe 


similar to yeast cytoskeleton control 
protein Bni1p 


70 


24 


1078 


gi18145107 


Clostridium 
perfringens 


probable transcriptional regulator 


71 


28 


1078 


gi|9581801|e 
mb|CAC005 
46.1| 


Plasmodium 
falciparum 


guanylyl cyclase 


69 


24 


1078 


gi|16805032| 
ref|NP_4730 
61.1| 


Plasmodium 
falciparum 


Ser/Thr protein kinase 


69 


26 


1079 


gi|20886321| 
ref|XP_1406 
14.1| 


Mus musculus 


similar to olfactory receptor, family 5, 
subfamily V, member 1; olfactory 
receptor, family 5, subfamily V 
member 1 


72 


34 


1081 


gi9650824 


Petroselinum 
crispum 


common plant regulatory factor 5 


76 


28 


1081 


gi559695 


Hydrolagus 
colliei 


This CDS feature is included to show 
the translation of the corresponding 
C_region. Presently translation 
qualifiers on C_region features are 
illegal 


74 


31 


1081 


gi476622 


Hydrolagus 
colliei 


immunoglobulin light chain 


74 


31 


1082 


AAM39205 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2350. 


363 


71 


1082 


AAO07159 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 21051. 


357 


76 


1082 


AAM40991 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5922. 


343 


79 


1083 


gi|17229222| 
ref|NP_4857 
70.1| 


Nostoc sp. PCC 
7120 


similar to HetF protein 


72 


30 


1084 


gi17221628 


Felis catus 


T-lymphocyte surface CD2 antigen 


76 


38 


1084 


gi18565073 


Crimean-Congo 
hemorrhagic 
fever virus 


envelope glycoprotein precursor 


74 


29 


1084 


gi|17221628| 
dbj|BAB784 
75.1| 


Felis catus 


T-lymphocyte surface CD2 antigen 


76 


38 


1085 


gi17430213 


Ralstonia 
solanacearum 


PUTATIVE HEMAGGLUTININ- 
RELATED PROTEIN 


74 


26 


1087 


gi2323287 


multiple 
sclerosis 
associated 
retrovirus 


polyprotein 


618 


79 


1087 


gi|4996596|d 
bj|BAA7854 
9.1| 


Human 
endogenous 
retrovirus W 


polyprotein 


317 


74 


1087 


gi|9630708|r 
ef|NP_0472 
55.1| 


Feline leukemia 
virus 


gag-pol precursor polyprotein gPr80 


293 


38 


1088 


gi15075953 


Sinorhizobium 
meliloti 


PUTATIVE MOLYBDENUM 
TRANSPORT SYSTEM PERMEASE 
ABC TRANSPORTER PROTEIN 


70 


56 


1088 


gi2288880 


Arthrobacter 
nicotinovorans 


transmembrane protein 


67 


56 


1088 


gi17298547 


Bradyrhizobium 
japonicum 


ModB 


67 


56 


1089 


AAY95660 


Homo sapiens 


ZYMO Human Zntr2 protein. 


231 


61 


1089 


AAU83682 


Homo sapiens 


GETH Human PRO protein, Seq ID No 
182. 


210 


59 


1089 


AAY99386 


Homo sapiens 


GETH Human PRO1305 (UNQ671) 
amino acid sequence SEQ ID NO: 153. 


210 


59 


1090 


gi7688355 


Solanum 
tuberosum 


Dof zinc finger protein 


70 


31 


1090 


gi4389445 


Drosophila 
melanogaster 


transcription factor 


67 


32 


1090 


gi|7688355|e 
mb|CAB898 


Solanum 
tuberosum 


Dof zinc finger protein 


70 


31 


1092 


AAG78884 


Homo sapiens 


BIOW- Human ribosomal protein s5- 
17. 


90 


44 


1092 


AAM91239 


Homo sapiens 


HUMA- Human 
immune/haematopoietic antigen SEQ 
ID NO: 18832. 


72 


53 


1092 


AAM95026 


Homo sapiens 


HUMA- Human reproductive system 
related antigen SEQ ID NO: 3684. 


72 


48 


1094 


gi18676450 


Homo sapiens 


FLJ00122 protein 


69 


38 


1094 


gi18073428 


Homo sapiens 


stabilin-2 


69 


38 


1094 


gi|20806091| 
ref|NP_0600 
34.8| 


Homo sapiens 


stabilin-2; CD44-like precursor FELL 


69 


38 


1095 


gi20906397 


Methanosarcina 
mazei Goe1 


conserved protein 


76 


44 


1095 


gi|21299784| 
gb|EAA119 
29.1| 


Anopheles 
gambiae str. 
PEST 


agCP6531 


75 


30 


1095 


gi|17549046| 
ref|NP_5223 
86.1| 


Ralstonia 
solanacearum 


CONSERVED HYPOTHETICAL 
PROTEIN 


73 


32 


1096 


AAB58317 


Homo sapiens 


ROSE/ Lung cancer associated 
polypeptide sequence SEQ ID 655. 


678 


100 


1096 


gi862600 


Drosophila 
melanogaster 


male-specific lethal-1 protein 


176 


25 


1096 


gi601930 


Oryctolagus 
cuniculus 


neurofilament-H 


115 


24 


1097 


AAU83109 


Homo sapiens 


ZYMO Novel secreted protein 
Z701935G4P. 


76 


85 


1097 


gi|20348496| 
ref|XP_1117 
12.1| 


Mus musculus 


similar to RIKEN cDNA 9030605E16 


72 


57 


1098 


gi18031887 


Mus musculus 


Fanconi anemia complementation 
group G 


77 


29 


1098 


gi12002137 


Mus musculus 


Fanconi anemia group G protein 


77 


29 


1098 


AAB72381 


Homo sapiens 


LEEM/ Human hairy and enhancer of 
Split homologue amino acid sequence. 


75 


28 


1099 


gi8217648 


Homo sapiens 


dJ579F20.1 (high-mobility group 
(nonhistone chromosomal) protein 1- 
like 1) 


159 


70 


1099 


gi5815432 


Gallus gallus 


high mobility group protein HMG1 


154 


70 


1099 


gi4140289 


Gallus gallus 


high mobility group 1 protein 


154 


70 


1100 


ABB11527 


Homo sapiens 


HYSE- Human apolipoprotein B 
receptor homologue, SEQ ID NO: 1897. 


84 


26 


1100 


gi487347 


Homo sapiens 


breakpoint cluster region protein 


81 


32 


1100 


gi144050 


Bordetella 
pertussis 


filamentous hemagglutinin 


78 


30 


1102 


AAM68946 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 29252. 


327 


81 


1102 


AAM79768 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3414. 


324 


80 


1102 


AAM78784 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1446. 


324 


80 


1103 


AAZ11186_ 
aa1 


Homo sapiens 


SAGA Gene encoding transmembrane 
domain containing protein clone 
HP02239. 


143 


68 


1103 


AAD31079_ 
aa1 


Homo sapiens 


INCY- Human cornichon protein 
(CORN) cDNA. 


143 


68 


1103 


AAA88439_ 
aa1 


Homo sapiens 


GETH Antitumour PRO181 cDNA 
clone DNA23330-1390. 


143 


68 


1104 


ABB07527 


Homo sapiens 


INCY- Human drug metabolizing 
enzyme (DME) (ID: 5643401CD1). 


562 


100 


1104 


ABB07515 


Homo sapiens 


INCY- Human drug metabolizing 
enzyme (DME) (ID: 8097779CD1). 


562 


100 


1104 


gi13161409 


Mus musculus 


family 4 cytochrome P450 


431 


76 


1107 


gi13542874 


Mus musculus 


Similar to CGI-67 protein 


677 


64 


1107 


AAU81978 


Homo sapiens 


INCY- Human secreted protein SECP4. 


665 


65 


1107 


AAU77137 


Homo sapiens 


MILL- Human alpha/beta hydrolase 
38618 polypeptide. 


665 


65 


1108 


gi13620885 


Homo sapiens 


mitochondrial ribosomal protein S6 


323 


100 


1108 


gi13620887 


Mus musculus 


mitochondrial ribosomal protein S6 




Ox. 


1108 


gi19713140 


Fusobacterium 
nucleatum subsp. 
nucleatum 
ATCC 25586 


Fusobacterium outer membrane protein 
family 


79 


28 


1109 


gi18378673 


Homo sapiens 


PATE 


607 


89 


1109 


gi5305193 


Rattus 
norvegicus 


sperm protein 10 


108 


30 


1109 


gi969103 


Mus musculus 


mSP-10 


1 AT 
10/ 


z/ 


1110 


gi2462979 


Bos taurus 


Tenascin-X 


119 


34 


1110 


gi3413958 


Homo sapiens 


LDL receptor related protein 105 


110 


27 


1110 


gi13938519 


Homo sapiens 


low density lipoprotein receptor-related 
protein 3 


110 


27 


1111 


gi17981053 


Mus musculus 


transcription factor NFAT5 


82 


32 


1111 


gi15425825 


Mus musculus 


tonicity-responsive enhancer binding 
protein 


82 


32 


1111 


gi6911148 


Mus musculus 


transcription factor NFAT5 isoform b 


82 


32 


1112 


gi6634473 


Metarhizium 
anisopliae var. 
anisopliae 


adenylate cyclase, ACY 


73 


30 


1113 


AAU19759 


Homo sapiens 


HUMA- Human novel extracellular 
matrix protein, Seq ID No 409. 


900 


70 


1113 


gi3171934 


Mus musculus 


neuronal-STOP protein 


886 


52 


1113 


gi2769587 


Mus musculus 


STOP protein 


885 


52 


1114 


gi18652188 


Oenococcus oeni 


OppF 


72 


41 


1115 


gi9119 


Drosophila sp. 


fos-related antigen 


69 


37 


1115 


gi7769652 


Drosophila 
melanogaster 


Fos-related antigen 


69 


37 


1115 


gi17862946 


Drosophila 
melanogaster 


SD04477p 


69 


37 


1116 


gi21212948 


Mus musculus 


peroxisomal protein (PeP) 


243 


83 


1116 


gi2347114 


Mus musculus 


CC chemokine receptor-5 


72 


28 


1116 


gi2431976 


Mus musculus 


CCR5 


72 


28 


1117 


gi|20825251| 
ref|XP_1319 
98.1| 


Mus musculus 


similar to RE1-silencing transcription 
factor, neuron restrictive silencer 
factor, repressor binding to the X2 box 


77 


40 


1117 


gi|15597871| 
ref|NP_2513 
65.1| 


Pseudomonas 
aeruginosa 


probable type II secretion system 
protein 


69 


41 


1118 


gi|3860513|e 
mb|CAA135 
74.1| 


Mus famulus 


reverse transcriptase 


303 


82 


1118 


gi|3860536|e 
mb|CAA135 
77.1| 


Mus saxicola 


reverse transcriptase 


303 


81 


1118 


gi|3860510|e 
mb|CAA135 
73.1| 


Mus dunni 


reverse transcriptase 


298 


63 


1119 


AAO04758 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 18650. 


234 


59 


1119 


AAM69569 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 29875. 


220 


63 


1119 


AAM67717 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 28023. 


219 


49 


1120 


gi21107877 


Xanthomonas 
axonopodis pv. 
citri str. 306 


cytochrome C 


78 


27 


1120 


gi15292331 


Drosophila 
melanogaster 


LD47230p 


77 


42 


1120 


gi15072444 


Avian 
paramyxovirus 6 


phosphoprotein 


72 


38 


1121 


AAB44126 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO: 1571. 


150 


83 


1121 


gi550015 


Homo sapiens 


ribosomal protein L21 


150 


83 


1121 


gi619788 


Homo sapiens 


L21 ribosomal protein 


150 


83 


1122 


AAU74448 


Homo sapiens 


OULU- Human protein sequence of 
lysyl hydroxylase 1 (LH1). 


125 


100 


1122 


gi190074 


Homo sapiens 


lysyl hydroxylase 


125 


100 


1122 


gi5817297 


Homo sapiens 


lysyl hydroxylase 1 


125 


100 


1123 


gi21281601 


Caenorhabditis 
elegans 


C. elegans PQN-44 protein 
(corresponding sequence F55A12.9c) 


78 


34 


1123 


gi14578225 


Caenorhabditis 
elegans 


C. elegans PQN-44 protein 
(corresponding sequence F55A12.9b) 


76 


38 


1123 


gi2088669 


Caenorhabditis 
elegans 


C. elegans PQN-44 protein 
(corresponding sequence F55A12.9a) 


76 


38 


1125 


AAU17301 


Homo sapiens 


HUMA- Novel signal transduction 
pathway protein, Seq ID 866. 


344 


88 


1125 


AAE11776 


Homo sapiens 


INCY- Human kinase (PKIN)-10 
protein. 


344 


88 


1125 


AAU17304 


Homo sapiens 


HUMA- Novel signal transduction 
pathway protein, Seq ID 869. 


340 


86 


1126 


AAM41712 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6643. 


152 


96 


1126 


AAM39926 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 3071. 


152 


96 


1126 


AAM79067 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1729. 


152 


96 


1127 


AAE02938 


Homo sapiens 


MILL- Human adenylate cyclase 
25678. 


252 


98 


1127 


AAB02006 


Homo sapiens 


TEXA Adenylyl cyclase type II-C2 C2 
alpha domain. 


252 


98 


1127 


gi202752 


Rattus 
norvegicus 


adenylyl cyclase type II 


252 


98 


1128 


AAA94860_ 
aa1 


Homo sapiens 


TEXA Human caspase activator Smac 
coding sequence. 


96 


100 


1128 


AAU78447 


Homo sapiens 


UYJE- Inhibitor of apoptosis (IAP) 
protein Smac. 


96 


100 


1128 


AAB26210 


Homo sapiens 


TEXA Human caspase activator Smac. 


96 


100 


1129 


gi3874765 


Caenorhabditis 
elegans 


Similarity to Drosophila acetylcholine 
receptor protein 
(SW:ACH1_DROME), contains 
similarity to Pfam domain: PF00065 
(Neurotransmitter-gated ion-channel), 
Score=296.9, E-value=5e-86, N=3 


97 


30 


1129 


gi6681597 


Yaba monkey 
tumor virus 


similar to vaccinia G8R 


72 


28 


1129 


gi|17548199| 
ref|NP_5099 
32.1| 


Caenorhabditis 
elegans 


acetylcholine receptor 


97 


30 


1130 


gi|17564116| 
ref|NP_5064 
84.1| 


Caenorhabditis 
elegans 


tyrosine-protein kinase 


73 


29 


1131 


gi13925613 


Homo sapiens 


insulinoma-associated protein IA-6 


88 


27 


1131 


gi158485 


Drosophila 
melanogaster 


son of sevenless protein 


85 


24 


1131 


gi7287782 


05-Feb-1998 


symbol=Sos; 
synonym=BG:DS00941.4; 
match=method:"sim4", score:"1000.0", 
desc:"GenBank::M83931:Drosophila 
melanogaster son of sevenless (Sos) 
mRNA, complete cds. CDS:346..5133; 
PID:g158485.", species:"Drosophila 
melanogaster"; 
match=method:"BLASTX", 
version:"2.0a19MP-WashU [Build 
sol2.5-ultra 01:47:30 


85 


24 


1132 


gi9696 


Mytilus edulis 


polyphenolic adhesive protein 


75 


25 


1134 


gi13562016 


Plectreurys tristis 


fibroin 2 


72 


29 


1134 


gi1129074 


Bacillus subtilis 


beta-N-acetylglucosaminidase 


69 


28 


1134 


gi2636104 


Bacillus subtilis 


N-acetylglucosaminidase (major 
autolysin) (CWBP90) 


69 


28 


1135 


AAB58870 


Homo sapiens 


HUMA- Breast and ovarian cancer 
associated antigen protein sequence 
SEQ ID 578. 


72 


80 


1135 


gi11595476 


Homo sapiens 


RPB11b1beta protein 


72 


80 


1135 


AAB44840 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 11. 


69 


45 


1137 


gi206985 


Rattus 
norvegicus 


troponin I 


70 


46 


1137 


gi16945895 


Takifugu 
rubripes 


SUN-like 1 


70 


31 


1137 


gi|8394466|r 
ef|NP_0588 
81.1| 


Rattus 
norvegicus 


troponin I, skeletal, fast 2 


70 


46 


1140 


AAO04998 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 18890. 


277 


96 


1140 


gi19917538 


Methanosarcina 
acetivorans str. 
C2A] 

[Methanosarcina 
acetivorans C2A 


mttA/Hcf106 protein 


80 


28 


1140 


gi4959705 


Mus musculus 


fibulin-2 


76 


28 


1141 


gi10141010 


Vesicular 
exanthema of 
swine virus 


non-structural polyprotein 


91 


31 


1141 


gi6566147 


Drosophila 
melanogaster 


large Forked protein 


85 


30 


1141 


gi2317953 


murid 

herpesvirus 4 


glycoprotein 150 


79 


28 


1142 


AAB54067 


Homo sapiens 


HUMA- Human pancreatic cancer 
antigen protein sequence SEQ ID 
NO:519. 


218 


56 


1142 


gil710365 


Mus musculus 


noggin 


89 


29 


1142 


gi21105761 


Equus caballus 


noggin 


89 


29 


1143 


gi|21295753|gb|EAA07898.1|


Anopheles 
gambiae str. 
PEST 


agCP1560 


69 


26 


1144 


gi505094 


Homo sapiens 


similar to an actin bundling protein,
dematin.


127


35






1144 


gi2337952 


Homo sapiens 


actin-binding double-zinc-finger 
protein 


122 


36 


1144 


gi21304227 


Oryza sativa 


ovule development aintegumenta-like 
protein BNM3 


76 


29 


1145 


gi|21298336|gb|EAA10481.1|


Anopheles 
gambiae str. 
PEST 


agCP2121 


68 


37


1146 


AAW22049 


Homo sapiens 


INCY- Interferon gamma inducing 
factor-2 (IGIF-2) alternate transcript 
variant. 


221 


100 


1146 


AAV05368_aa1


Homo sapiens 


SCHE cDNA encoding human 
interleukin-1-gamma.


167 


84 


1146 


AAH78060_aa1


Homo sapiens 


STRD Nucleotide sequence of human 
interleukin 18 (IL-18).


167 


84 


1147 


AAY57937 


Homo sapiens 


INCY- Human transmembrane protein 
HTMPN-61. 


123 


100 


1147 


gi|20345904|ref|XP_109823.1|


Mus musculus


similar to delta-like homolog 
(Drosophila) 


105 


86 


1148 


gil9069293 


Encephalitozoon 
cuniculi 


similarity to ADP/ATP CARRIER
PROTEIN 


75


32 


1148 


gi8978336 


Arabidopsis 
thaliana 


contains similarity to CHP-rich zinc 
finger protein~gene id:K23F3.4 


74 


26 


1148 


gil9716318 


Aspergillus 
flavus 


antigenic cell wall protein MP1 


74 


32 


1149 


gi5456699 


Emericella 
nidulans 


ATP-binding cassette multidrug 
transport protein ATRC 


70 


35 


1149 


gi|20898840|ref|XP_139387.1|


Mus musculus 


similar to HSPC038 protein 


69 


31 


1150 


gi3883128 


Arabidopsis 
thaliana 


arabinogalactan-protein


96 


32 


1150 


gil7429208 


Ralstonia 
solanacearum 


CONSERVED HYPOTHETICAL 
PROTEIN 


92 


26 


1150 


gi4063766 


Emericella 
nidulans 


chitinase 


91 


27 


1151 


gil3561058 


Homo sapiens 


dJ1108D11.1 (novel protein similar to
C. elegans T22C1.7) 


107 


31 


1151 


gi21105299


Mytilus 

galloprovincialis 


precollagen-NG


105 


26 


1151 


gil4164347 


Oncorhynchus 
mykiss 


collagen a1(I)


96 


28 


1152 


gi18479434


Mus musculus 


olfactory receptor MOR188-1


76 


33 


1152 


gi2653915 


Oran virus 


glycoprotein G1 and G2 precursor;
envelope glycoprotein precursor 


72 


46 


1152 


gi18479436


Mus musculus 


olfactory receptor MOR188-2


72 


33 


1153 


gi3403167 


Homo sapiens 


GBAS 


161 


86 


1153 


gi12804791


Homo sapiens 


glioblastoma amplified sequence 


161 


86 


1153 


AAB57149 


Homo sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO: 1727. 


134 


81 


1154 


gil7742234 


Agrobacterium 
tumefaciens str. 
C58 (U. Washington)


histidase


87


35



WO 03/080795



PCT/US02/25485



Table 2



SEQ
ID
NO:


Accession
No.


Species


Description


Score


%
Identity








1154 


gil5159496 


Agrobacterium 
tumefaciens str. 
C58 (Cereon) 


AGR_L_1400GMp 


87 


35 


1154 


gil58521 


Drosophila 
melanogaster 


seven-up protein type 2 


80 


32 


1155 


gi|10441551|gb|AAG17099.1|AF189115_1


Cryptotermes
domesticus 


cytochrome b 


65 


28 


1156 


AAO12089


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 25981. 


475 


98 


1156 


gi20147787 


Xenopus laevis 


nuclear receptor corepressor


74 


25 


1156 


gil9881705 


Oryza sativa 


Putative transposable element 


72 


32 


1157


gi9963851 


Homo sapiens 


HT019 


80 


34 


1157 


AAB93530 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:12884.


77 


34 


1157 


gi1040970


Homo sapiens 


fus-like protein 


77 


42 


1158 


gi9795254 


Sepia officinalis 


GABA-A receptor beta subunit 


71 


27 


1158 


gil5026157 


Clostridium 
acetobutylicum 


amidase, germination specific 
(cwlC/cwlD B.subtilis ortholog) 


68 


34 


1158 


gi|9795254|gb|AAF97816.1|


Sepia officinalis 


GABA-A receptor beta subunit 


71 


27 


1159 


AAB93423 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 12641.


336 


100 


1159 


gil3097768 


Homo sapiens 


Similar to RIKEN cDNA 2900073H19 
gene 


336 


100 


1159 


gi20071708 


Mus musculus


RIKEN cDNA 2900073H19 gene 


334 


96 


1160 


AAM72558 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 32864. 


274 


100 


1160 


AAM59959 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 32064. 


274 


100 


1161 


AAB07704 


Homo sapiens 


INMR Protein encoded by the 
endogenetic fragment of HERV-W. 


139 


36 


1161 


gi8272464 


Homo sapiens 


gag 


139 


36 


1161 


gi|5726238|g 

b|AAD4837 

5.1|AF1238 

8 Li 


multiple 

sclerosis 

associated 

retrovirus 

element 


gag polyprotein 


131 


35 


1162 


AAU25448 


Homo sapiens 


INCY- Human mddt protein from clone 
LG:1083264.1:2000MAY19.


346 


79 


1162 


AAU11265 


Homo sapiens 


BODE- Human zinc finger protein 51.


319 


65 


1162


nAljyjOj / 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:18371.


314 


67 


1163 


gil4189950 


Homo sapiens 


connexin 58 


536 


84 


1163 


gi9957542 


Homo sapiens 


connexin 59 


536 


84 


1163 


gil0946367 


Danio rerio 


connexin 55.5 


485


81 


1164 


gi755700 


Bombyx mori 


sericin1B


76 


27 


1164 


gil 9569861 


Dictyostelium 
discoideum 


RTOA protein (Ratio-A). 


76 


28


1164


gil0580635 


Halobacterium 
sp. NRC-1 


Vng1087c


76 


25 


1165 


gil9915386 


Methanosarcina 
acetivorans str. 
C2A] 

[Methanosarcina 
acetivorans C2A 


WD-domain containing protein 


89 


28 


1165 


gi5639663 


Homo sapiens 


WD repeat protein WDR3 


83 


28 


1165 


gil 1544739 


Homo sapiens 


dJ776P7.2 (WD repeat domain 3) 


83 


28 


1166 


AAM69338 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 29644. 


72 


31 


1166 


AAM56953 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 29058. 


72 


31 


1166 


gi20197507 


Arabidopsis 
thaliana 


expressed protein 


67 


39 


1167 


gi5802812 


Homo sapiens 


Gag protein 


83 


30 


1167 


gi7160650 


Bordetella 
bronchiseptica 


pertactin (P.68) 


79 


31 


1167 


gil3173444 


Bordetella 
bronchiseptica 


pertactin 


79 


31 


1168 


gi!495029 


Danio rerio 


protein kinase CK2 alpha 1 


84 


24 


1168 


gi643443 


Penicillium
chrysogenum 


PHOG 


82 


32 


1168 


gi|18858419|ref|NP_571315.1|


Danio rerio 


casein kinase 2 alpha 2 


84 


24 


1169 


gi206716 


Rattus 
norvegicus 


salivary proline-rich protein 


90 


31 


1169 


gil5029903 


Mus musculus 


Similar to proline-rich protein BstNI 
subfamily 2 


89 


36 


1169 


gi53182 


Mus musculus 


proline rich protein 


81 


34 


1170 


gi|17553370|ref|NP_498318.1|


Caenorhabditis 
elegans 


F40H6.5.p 


78 


33 


1170 


gi|15215731|gb|AAK91411.1|


Arabidopsis 
thaliana 


AT4g36780/C7A10_580 


73 


30 


1171 


gi340446 


Homo sapiens 


zinc finger protein 7 (ZFP7) 


218 


61 


1171 


AAB43928 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO:1373. 


216 


58 


1171 


AAB21040 


Homo sapiens 


INCY- Human nucleic acid-binding 
protein, NuABP-44. 


213 


48 


1172 


AAE04368 


Homo sapiens 


INCY- Human kinase (PKIN)-9. 


120 


85 


1172 


AAM79153 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1815. 


120 


85 


1172 


AAE10614 


Homo sapiens 


CURA- Human novel STE20-like 
protein, NOV-3d 


120 


85 


1173 


gi218572 


Pan troglodytes 


prot GOR 


74 


29 


1173 


gi243898 


Pan 


GOR 


74 


29 


1173 


gil 666473 


Mus musculus 


NOV protein 


71 


50 


1174 


gi5901830


Drosophila 
melanogaster 


BcDNA.GH07910 


74 


31


1174


AAM80237 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3883. 


71 


38 


1174 


ABB11528 


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO: 1898, 


71 


38 


1175 


gi|12054759|emb|CAC20748.1|


Podospora 
anserina


catalase A 


65 


33 


1176 


AAM93289 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 2777. 


145 


100 


1176 


gil7431512 


Ralstonia 
solanacearum


PUTATIVE OUTER MEMBRANE 
CHANNEL LIPOPROTEIN 
TRANSMEMBRANE 


71 


26 


1176 


gil5823991 


Streptomyces 
avermitilis 


modular polyketide synthase 


70 


51 


1177 


AAM41939 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6870. 


84 


61 


1177 


gi870751 


Homo sapiens 


N-acetylgalactosamine 6-sulfate 
sulfatase (GALNS) 


84 


61 


1177 


gi618426 


Homo sapiens 


N-acetylgalactosamine 6-sulphatase 


84 


61 


1178 


gi435855 


Mus sp. 


CREB-binding protein; CBP 


89 


22 


1178 


AAW40058 


Homo sapiens 


USSH Cellular transcriptional factor 
CBP. 


87 


22 


1178 


gil7944308 


Drosophila
melanogaster 


RE12101p 


86 


26 


1179 


AAM25814 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO: 1329. 


73 


93 


1179 


AAM25290 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO:805. 


73 


93 


1179 


AAM79441 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3087. 


73 


93 


1180 


AAB88388 


Homo sapiens 


HELI- Human membrane or secretory
protein clone PSEC0131. 


719 


97 


1180 


gi20810493 


Homo sapiens 


Similar to RIKEN cDNA 2810417M05 
gene 


716 


96 


1180 


AAD30543_aa1


Homo sapiens 


MILL- Human B7RP-2 DNA. 


83 


38 


1181 


ABB14686 


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 3343. 


190 


97 


1181 


gil4329731 


Secale cereale 


high molecular weight glutenin subunit 

X 


88 


27 


1181 


gil4329761 


Triticum 
aestivum 


high molecular weight glutenin subunit 

X 


84 


26 


1182 


gil 1692645 


Mus musculus


aspartyl beta-hydroxylase


74 


28 


1182 


gi11878112


Mus musculus 


aspartyl beta-hydroxylase 6.6 kb 
transcript 


74 


28 


1182 


gi11878110


Mus musculus 


aspartyl beta-hydroxylase 4.5 kb 
transcript 


74 


28 


1183 


gil5485622 


Homo sapiens 


Q9H4T4 like 


80 


25 


1183 


gi19714949


Fusobacterium 
nucleatum subsp. 
nucleatum
ATCC 25586 


TonB protein 


78 


32 


1183 


gi7717375 


Homo sapiens 


human CHD2-52 down syndrome cell 
adhesion molecule 


71 


23


1184


AAU83667 


Homo sapiens 


GETH Human PRO protein, Seq ID No 
152. 


388 


100 


1184 


AAG89161 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 281. 


388 


100 


1184 


AAY99348 


Homo sapiens 


GETH Human PRO1194 (UNQ607)
amino acid sequence SEQ ID NO:29. 


388 


100 


1185 


AAB93506 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 12830. 


543 


100 


1185 


AAB87570 


Homo sapiens 


GETH Human PRO1268.


426 


95 


1185 


AAY78808 


Homo sapiens 


PROT- Hydrophobic domain 
containing protein clone HP10537
protein sequence. 


426 


95 


1187 


gil5823978 


Streptomyces 
avermitilis 


modular polyketide synthase 


75 


41 


1187 


AAB66657 


Homo sapiens 


HSCR- Human elastin protein without 
signal peptide. 


71 


39 


1187 


AAY69137 


Homo sapiens 


UNSY Amino acid sequence of a 
human tropoelastin derivative. 


71 


39 


1188 


gi6907090 


Oryza sativa 

(japonica 

cultivar-group) 


Similar to Oryza sativa root-specific 
RCc3 mRNA. (L27208) 


76 


30 


1188 


AAY36063 


Homo sapiens 


GEST Extended human secreted 
protein sequence, SEQ ID NO. 448. 


74 


26 


1188 


AAY35971 


Homo sapiens 


GEST Extended human secreted 
protein sequence, SEQ ID NO. 220. 


73 


26 


1189 


gi9827989 


Leishmania 
major 


possible CG12797 protein 


72 


36 


1189 


gi|13625467| 
gb|AAK350 
68.1| 


Leishmania 
donovani 


LACK protective antigen 


68 


27 


1190 


gil7027071 


Xiphocentron sp. 
UMSP000029372-Costa Rica


elongation factor-1 alpha


107 


27 


1190 


gi310665


Strongylocentrotus
purpuratus


Nf-Y-A subunit 


88 


24 


1190 


gi21743 


Triticum 
aestivum 


high molecular weight glutenin subunit 
1Ax1


86 


23 


1191 


gil6878287 


Homo sapiens 


Similar to C-terminal modulator protein 


167 


96 


1191 


gil5866714 


Homo sapiens 


C-terminal modulator protein


167 


96 


1191 


AAO06984 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 20876. 


132 


83 


1192 


AAD05496_aa1


Homo sapiens 


HUMA- Human secreted protein- 
encoding gene 5 cDNA clone 
HHBCS39, SEQ ID NO:15.


859 


100 


1192 


AAE01707 


Homo sapiens 


HUMA- Human gene 5 encoded 
secreted protein HHBCS39, SEQ ID 
NO:119. 


859 


100 


1192 


AAE01676 


Homo sapiens 


HUMA- Human gene 5 encoded 
secreted protein HHBCS39, SEQ ID 
NO:88. 


859 


100 


1193 


gil8650588 


Homo sapiens 


retinoic acid early transcript 1 


1312 


99 


1193 


AAB15540 


Homo sapiens 


INCY- Human immune system 
molecule from Incyte clone 3402252. 


1283 


97 


1193 


ABB84887 


Homo sapiens 


GETH Human PRO791 protein
sequence SEQ ID NO: 142.


1234


94






1195 


gil 196427 


Homo sapiens 


gag 2 protein 


248 


50 


1195 


gil 780975 


Human 
endogenous 
retrovirus K 


gag protein 


248 


50 


1195 


gil556397 


Human 
endogenous 
retrovirus K 


gag 


248 


50 


1196 


gi556256 


Leishmania
donovani 


G protein alpha subunit 


72 


22 


1197 


AAY07237 


Homo sapiens 


ISTF Wild type monocyte chemotactic 
protein 2. 


121 


100 


1197 


AAY05300 


Homo sapiens 


ISTF C-C chemokine, MCP2. 


121 


100 


1197 


AAW42072 


Homo sapiens 


INCY- Human MC proprotein. 


121 


100 


1198 


ABB57423 


Homo sapiens 


HUMA- Human secreted protein 
encoding polypeptide SEQ ID NO 69. 


187 


79 


1198 


ABB57394 


Homo sapiens 


HUMA- Human secreted protein 
encoding polypeptide SEQ ID NO 40.


187 


79 


1198 


AAY59757 


Homo sapiens 


META- Human normal ovarian tissue 
derived protein 34. 


187 


79 


1199 


AAY72603 


Homo sapiens 


INCY- Human Electron Transfer 
Protein, ETRN-1. 


155 


100 


1199 


AAB88465 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0259. 


155 


100 


1199 


AAE03926 


Homo sapiens 


HUMA- Human gene 29 encoded 
secreted protein HTADC63, SEQ ID 
NO:89. 


155 


100 


1200


gi6458884 


Deinococcus 
radiodurans 


chorismate mutase/prephenate 
dehydratase 


73 


42 


1201 


gi20803920 


Mesorhizobium 
loti


HYPOTHETICAL PROTEIN 


68 


32 


1201 


gi|17545158| 
ref|NP 5185 
60.1| 


Ralstonia 
solanacearum 


PUTATIVE LIPASE/ESTERASE 
PROTEIN 


66 


31 


1202 


AAM67586 


Homo sapiens


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27892. 


69 


30 


1202 


AAM55191 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 27296. 


69 


30 


1202 


gi849219 


Saccharomyces 
cerevisiae 


Pro1p: Glutamate 5-kinase (Swiss Prot.
accession number P32264) 


69 


33 


1203 


gil8676554 


Homo sapiens 


FLJ00174 protein 


269 


84 


1203 


gi|20913341|ref|XP_126763.1|


Mus musculus 


similar to FLJ00174 protein 


125 


81 


1203 


gi|20850247|ref|XP_136664.1|


Mus musculus 


similar to proline-rich protein 


121 


33 


1204 


AAM68056 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 28362. 


140 


84 


1204 


AAM55676 


Homo sapiens 


MOLE- Human brain expressed single
exon probe encoded protein SEQ ID
NO: 27781.


140


84






1205 


gi541624 


Drosophila 
virilis 


pdm2 


71 


39 


1205 


gi9955855 


Aspergillus 
oryzae 


RNA polymerase II largest subunit 


69 


38 


1205 


gi662296 


Rattus 
norvegicus 


MIBP1 


68 


32 


1206 


ABB50703 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 52 SEQ ID NO:651.


260 


94 


1206 


AAW88802 


Homo sapiens 


HUMA- Polypeptide fragment encoded 
by gene 52. 


260 


94 


1206 


ABB50706 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 52 SEQ ID NO:654.


143 


96 


1207 


AAM79588 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3234. 


72 


41 


1207 


AAM78604 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1266. 


72 


41 


1207 


AAB58944 


Homo sapiens 


HUMA- Breast and ovarian cancer 
associated antigen protein sequence 
SEQ ID 652. 


72 


41 


1208 


AAE03429 


Homo sapiens 


HUMA- Human gene 3 encoded 
secreted protein HETDB76, SEQ ID 
NO: 112. 


575 


64 


1208 


gi19110438


Homo sapiens 


polycystin-1L1


575 


64 


1208


AAE03463 


Homo sapiens 


HUMA- Human gene 3 encoded 
secreted protein HETDB76, SEQ ID 
NO: 146. 


185 


97 


1209 


gi6760015 


Homo sapiens 


brain protein 


1114 


85 


1209 


gil747306 


Mus musculus 


SDR2


151 


31 


1209 


gi20381292 


Mus musculus 


stromal cell derived factor receptor 2 


151 


31 


1211 


gil4043211 


Homo sapiens 


Similar to RIKEN cDNA 4931428F04
gene 


460 


89 


1211 


gil90508 


Homo sapiens 


salivary proline-rich protein precursor 


113 


28 


1211 


gil2862320 


Homo sapiens 


WDC146 


102 


28 


1212 


AAO14407 


Homo sapiens 


FARB Human 11 beta-hydroxysteroid
dehydrogenase 1-like enzyme. 


291 


63 


1212 


AAM79592 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3238. 


217 


45 


1212 


gi4581319 


Homo sapiens 


dJ28O10.3 (HSD11B1 (hydroxysteroid
(11-beta) dehydrogenase 1)


217 


45


1213 


AAR06514 


Homo sapiens 


STRI Natural human Platelet Factor-
4var1 encoded by EcoRI fragment.


238 


64 


1213 


gi292390 


Homo sapiens 


platelet factor 4 


238 


64 


1213 


AAZ28361_aa1


Homo sapiens 


SMIK Platelet factor-4 (PF-4) 
nucleotide sequence. 


200 


56 


1214 


AAD12580_aa1


Homo sapiens 


SAGA Human protein having 
hydrophobic domain encoding cDNA 
clone HP10753. 


162 


82 


1214 


AAD08193_aa1


Homo sapiens 


HUMA- Human secreted protein- 
encoding gene 3 cDNA clone 
HNTAC64, SEQ ID NO:13. 


162 


82 


1214 


AAD05544_aa1


Homo sapiens 


HUMA- Human secreted protein- 
encoding gene 12 cDNA clone 
HNTAC64, SEQ ID NO:63. 


162 


82


1215


gi21429094 


Drosophila 
melanogaster 


LD38004p 


354 


49 


1215 


gil5292155 


Drosophila 
melanogaster 


LD40717p 


354 


49 


1215 


AAG75596 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:6360. 


294 


50 


1216 


gi7248894 


Xenopus laevis 


Arg protein-tyrosine kinase 


84 


35 


1216 


gi402191 


Mus musculus 


HNF-3beta 


80 


26 


1216 


gi404764 


Mus musculus 


fork head related protein 


80 


26 


1218 


AAM39205 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2350. 


559 


74 


1218 


AAO03505 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 17397. 


502 


81 


1218 


AAM40991 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5922. 


467 


66 


1220 


AAO01188 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 15080. 


248 


86 


1220 


AAY73334 


Homo sapiens 


INCY- HTRM clone 1805061 protein 
sequence. 


79 


35 


1220 


gi20249 


Oryza sativa


Rt-2 


77 


32


1221 


gi4519619


Haliotis discus 


collagen pro alpha-chain 


90 


28 


1221 


gi7380690 


Neisseria 

meningitidis 

Z2491 


UDP-N-acetylglucosamine--N-
acetylmuramyl-(pentapeptide)
pyrophosphoryl-undecaprenol N- 
acetylglucosamine transferase 


90 


37 


1221 


gi7225645 


Neisseria 

meningitidis 

MC58 


UDP-N-acetylglucosamine--N- 
acetylmuramyl-(pentapeptide) 
pyrophosphoryl-undecaprenol N- 
acetylglucosamine transferase 


90 


37 


1222 


ABA05334_aa1


Homo sapiens 


MILL- Human fucosyltransferase 
family member 32132 coding 
sequence. 


2154 


99 


1222 


AAM47905 


Homo sapiens 


MILL- Human fucosyltransferase 
family member 32132. 


2154 


99 


1222 


ABA05333_aa1


Homo sapiens 


MILL- Human fucosyltransferase 
family member 32132 encoding cDNA. 


2154 


99 


1223 


AAY21852 


Homo sapiens 


INCY- Human signal peptide-
containing protein (SIGP) (clone ID
2652271).


150 


100 


1223 


AAY48563 


Homo sapiens 


META- Human breast tumour-
associated protein 24.


150 


100 


1223 


AAW75103 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 47 clone HMCBP63. 


150 


100 


1224 


AAM67078 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27384. 


517 


99 


1224 


AAM54676 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26781. 


517 


99 


1224 


gil7467358 


Sus scrofa 


MIF2 suppressor 


184 


80 


1225 


gi9454237 


Cochliobolus
sativus 


DNA binding protein MAT-1 


73 


30 


1225 


gi21428792


Drosophila 
melanogaster 


GH03582p 


72 


38


1225 


gi6633838 


Arabidopsis 
thaliana 


F2K11.15 


70 


31 


1226 


gi21430124


Drosophila 


HL01222p 


76 


28








656. 






1235 


AAU18012


Homo sapiens 


HUMA- Human immunoglobulin
polypeptide SEQ ID No 157. 


178 


83 


1235 


ABB89226 


Homo sapiens 


HUMA- Human polypeptide SEQ ID 
NO 1602. 


78 


82 


1236 


gi10566951


Rattus 
norvegicus 


s-gicerin/MUC18 


85 


45 


1236


gi10566949


Rattus 
norvegicus 


l-gicerin/MUC18


85 


45 






Homo sapiens 


NOJI/ Human shear stress-response 
protein SEQ ID NO: 96. 


84 


42 


IZio 


gLdl404jUU 


Drosophila 
melanogaster 


GH20068p 


95 


36 




gu&ooo/y 


Xenopus laevis 


Zic-related-2 


88 


35 




gilo41 /jo 


Mus musculus 


GATA-5 cardiac transcription factor 


87 


52 


1239 


gil7946266 


Drosophila 
melanogaster 


RE61793p 


96 


40 


1239 


gil5636898 


Gallus gallus 


formin binding protein 11-related
protein 


91 


27 


1239 


gi780454 


African swine 
fever virus 


pB407L 


88 


30 


1240 


AAE05302 


Homo sapiens 


MILL- Human TANGO 457 protein. 


1331 


100 


1240 


AAE05303 


Homo sapiens 


MILL- Human mature TANGO 457 
protein. 


1207 


100 


1240 


AAE05305 


Homo sapiens 


MILL- Human TANGO 457 protein 
cytoplasmic domain. 


1201 


100 


1241 


gi5640111 


Lycopersicon 
esculentum 


RAD23 protein 


84 


25 


1241 


gil7131739 


Nostoc sp. PCC 
7120 


polyketide synthase type I 


76 


33 


1241 


gi|5640111|emb|CAB51544.1|


Lycopersicon 
esculentum 


RAD23 protein 


84 


25 


1242 


AAG03496 


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 7577. 


67 


39 


1242 


gi|13876270|gb|AAK26055.1|


Mus musculus 


protocadherin alpha 8 


66 


35 


1243 


AAE16665 


Homo sapiens 


MILL- Human calcium channel family 
member, 21784 protein. 


196 


87 


1243 


AAB62248 


Homo sapiens 


WARN Human calcium channel 
alpha2delta subunit. 


196 


87 


1243 


AAY92320 


Homo sapiens 


WARN Human alpha-2-delta-C
calcium channel subunit polypeptide. 


196 


87 


1244 


gi|4102990|gb|AAD01637.1|


Aspergillus 
nidulans 


DNA polymerase epsilon homolog 


70 


30 


1245 


gi5917666 


Zea mays 


extensin-like protein 


94 


26 


1245 


gil9481644 


shrimp white 
spot syndrome 
virus 


WSSV052 


89 


36 


1245 


gil7016928 


shrimp white 
spot syndrome 
virus 


wsv001


89 


36


1246 


AAO12623


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 26515. 


107 




1246 


AAO12822


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 26714.


153 


75 


1246 


AAO02255 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 16147. 


123 


65 


1247 


gil653353 


Synechocystis 
sp. PCC 6803 


nodulation protein 


75 


28 


1247 


gi4468626 


Mus musculus


TEF-5 


1A 
/4 


76 


1247 


gi17430764


Ralstonia 
solanacearum 


SKWP PROTEIN 5 


"7/1 


71 


1248 


gil 5 139973 


Sinorhizobium 
meliloti 


CONSERVED HYPOTHETICAL
PROTEIN 


*7"7 




1249 


gi7191078 


Leishmania 
major 


L712.2 




7Q 


1249 


gi17384256


Homo sapiens 


mucin 5 




51 


1249 


gi5821153


Homo sapiens 


RNA binding protein 


83 


33 


1250 


AAY36495 


Homo sapiens 


HUMA- Fragment of human secreted 
protein encoded by gene 27. 


124 


86 


1250 


AAO12122


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 26014. 


123 


91 


1250 


AAB95063 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:16901.


121 


90 


1252 


gi|15839838|ref|NP_334875.1|


Mycobacterium 

tuberculosis 

CDC1551 


membrane protein, MmpL family 


68 


27 


1254 


AAG00399 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 4480. 


328 


100 


1254 


gi21428466 


Drosophila 
melanogaster 


LD22609p 


85 


24 


1254 


gil9914274 


Methanosarcina 
acetivorans str. 
C2A 


sensory transduction histidine kinase 
[Methanosarcina


85


26 


1256 


gil4161094 


Choloepus 
didactylus 


von Willebrand Factor 


80 


24 


1256 


gil4161092 


Cyclopes 
didactylus 


von Willebrand Factor 


78 


23 


1256 


gil 3872552 


Acomys 
cahirinus 


von Willebrand Factor 


"7*7 

77 


15 \ 


1258 


gi7008025 


Callithrix 
jacchus 


prochymosin 


715 


64 


1258 


gil 1990126 


Camelus 
dromedarius 


chymosin 


634 


57 


1258 


gi491952 


synthetic 
construct 


preprochymosin 


618 


DO 


1259 


gi[21402709| 
rCI(JNJr 00 oO 
94.1| 


Bacillus 


AMP-binding, AMP-binding enzyme

PRjirilliiQ flnthra^iR 


72 


34 


1260 


gi|4505431|ref|NP_002510.1|


Homo sapiens 


nuclear protein, ataxia-telangiectasia 
locus; NPAT gene; E14 gene 


64 


33 


1260 


gi|15309894|ref|XP_040846.2|


Homo sapiens 


similar to nuclear protein, ataxia- 
telangiectasia locus; NPAT gene; E14 
gene 


64 


33


1260 


gi|1304114|dbj|BAA11861.1|


Homo sapiens 


NPAT 


64 


33 


1261 


gi4519535 


Homo sapiens 


Leukotriene B4 omega-hydroxylase


i j j 


40 


1261 


gil857022 


Homo sapiens 


leukotriene B4 omega-hydroxylase 


133 


49 


1261 


gil8266446 


Homo sapiens 


cytochrome P450, subfamily IVF,
polypeptide 2 


i ii 




1262 


gil3363530 


Escherichia coli 
O157:H7


cell division protein HflB/FtsH 
protease 


79 


26 


1262 


gi746401 


Escherichia coli 


ATP-binding protein


79


26


1262 


gil46028 


Escherichia coli 


ftsH


79 


26 


1263 


AAW67859 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 53 clone HBMCL41. 


283 


100 


1264 


gil 1066248 


Helix lucorum 


presenilin 


85 


21 


1264 


gi|19115422|ref|NP_594510.1|


Schizosaccharomyces
pombe


ribonuclease II RNB family protein; 
dis3-like 


69 


30 


1264 


gi|14720912|ref|XP_038204.1|


Homo sapiens 


similar to Matrin 3 


69 


32 


1265 


gi5757703 


Mus musculus 


syntrophin-associated serine-threonine 
protein kinase 


82 


38 


1265 


gi4996035 


Human 
herpesvirus 6 


69.8% identical to U47 gene of strain 
U1102 of HHV-6


76 


42 


1265 


gi330951 


Gallid

herpesvirus 1 


ICP4 


76 


36 


1266 


gi|17511177|ref|NP_493324.1|


Caenorhabditis 
elegans 


ZK1053.3.p 


75 


40 


1266 


gi|17538077|ref|NP_495159.1|


Caenorhabditis 
elegans 


ZK12482.p 


69 


34 


1267 


gi915540 


Ovis aries 


pregnancy-specific antigen 


85 


25 


1267 


gi6179989 


Capra hircus 


pregnancy-associated glycoprotein-2 


84 


25 


1267 


gi9798658 


Rhinolophus
ferrumequinum 


pepsinogen A 


80 


23 


1268 


gi|15789526|ref|NP_279350.1|


Halobacterium 
sp. NRC-1


serine proteinase; HtrA 


69 


30 


1269 


gi9988674 


Influenza A virus 
(A/Swine/Wisconsin/14094/99(H3N2))


hemagglutinin protein


70 


24 


1269 


gi6552676 


Influenza A virus 
(A/Bangkok/1/97 
(H3N2)) 


hemagglutinin 


70 


25 


1269 


gio55263o 


Influenza A virus 
(A/Trinidad/51/96(H3N2))


hemagglutinin 


70 


24 


1270 


gi3378527 


Zea mays 


anther specific protein 


87 


41 


1270 


AAW15787 


Homo sapiens 


PENN- Human metastasis suppressor 
KiSS-1. 


85 


28 


1270 


gi21410770 


Homo sapiens 


Similar to RIKEN cDNA 1500005K14 
gene 


84 


46


1271 


gil335527 


Human 
poliovirus 1 


reading frame VP3 


75 


38 


1271 


gi61253 


Human 
poliovirus 1 


polyprotein 


75 


38 


1271 


gi|17453412|ref|XP_063132.1|


Homo sapiens 


similar to 60S ribosomal protein L7A 
(Surfeit locus protein 3) 


76 


40 


1272 


AAU87081 


Homo sapiens 


BRIM Sialic acid-binding Ig-related 
lectin, Siglec-lt. 


69 


43 


1272 


AAU87077 


Homo sapiens 


BRIM Sialic acid-binding Ig-related 
lectin, Siglec-BMS-L3cL 


69 


43 


1272 


AAU87076 


Homo sapiens 


BRIM Sialic acid-binding Ig-related 
lectin, Siglec-BMS-L3c. 


69 


43 


1273 


AAA09121_ 
aal 


Homo sapiens 


CURA- Clone 2355875 cDNA 
(update), encodes syncollin homologue. 


720 


100 


1273 


AAY92233 


Homo sapiens 


CURA- Clone 2355875f - syncollin 
homologue. 


720 


100 


1273 


AAB54267 


Homo sapiens 


HUMA- Human pancreatic cancer 
antigen protein sequence SEQ ID 
NO:719. 


715 


100 


1274 


gil5559064 


Mus musculus 


SNAG1 


198 


59 


1274 


AAU17435 


Homo sapiens 


HUMA- Novel signal transduction 
pathway protein, Seq ID 1000.


131 


62 


1274 


AAW99023 


Homo sapiens 


MOUN 17G2 peptide sequence. 


131 


62 


1275 


gi|6753732|ref|NP_034243.1|


Mus musculus 


epidermal growth factor 


65 


30 


1275 


gi|50801|emb|CAA24115.1|


Mus musculus 


polyprotein 


65 


30 


1275 


gi|20341089|ref|XP_109385.1|


Mus musculus 


epidermal growth factor 


65 


30 


1276 


AAM39205 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2350. 


447 


78 


1276 


AAM40991 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5922. 


424 


74 


1276 


AAO07159 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 21051. 


401 


75 


1277 


gil3905120 


Mus musculus 


RIKEN cDNA 0610013I17 gene


134 


35 


1277 


gi!3936283 


Mus musculus 


TRH3 


134 


35 


1277 


AAB92625 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:10921.


127 


35 


1279 


AAM66940 


Homo sapiens 


MOLE- Human bone marrow
expressed probe encoded protein SEQ 
ID NO: 27246. 


362 


85 


1279 


AAM54534 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26639. 


J Ox 


OJ 


1279 


gi|208153|gb 
|AAA73184.


synthetic 
construct 


crystal toxin 


79 


40 


1280 


AAE05187 


Homo sapiens 


INCY- Human drug metabolising 
enzyme (DME-18) protein. 


484 


100


1280 


AAU12266 


Homo sapiens 


GETH Human PRO5780 polypeptide 
sequence. 


A QA 




1280 


AAY91631 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 24 SEQ ID 
NO:304. 


484 


100 


1281 


AAH46856_ 
aal 


Homo sapiens 


HUMA- Human serine/threonine 
phosphatase encoding cDNA (clone ID 
HLDOO20). 


238 


1 AA 


1281 


AAG77801 


Homo sapiens 


HUMA- Human HLDOO20 
serine/threonine phosphatase protein 
sequence. 


238 


100 


1281 


AAB85476 


Homo sapiens 


HUMA- Human serine/threonine 
phosphatase (clone ID HLDOO20). 


238 


100 


1282 


gi|14762786| 
ref|XP 0478 
71.1| 


Homo sapiens 


GS2 gene 


70 


30 


1283 


gi3860165 


Arabidopsis 
thaliana


disease resistance protein RPP1-WsB


69 


38 


1283 


AAO09033 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 22925. 


68 


38 


1283 


gi6967115 


Arabidopsis 
thaliana 


disease resistance protein homolog


68 


38 


1285 


gil055252 


Rattus 
norvegicus 


pheromone receptor VN5


78 


32 


1285 


gi2746733 


Drosophila 
virilis 


circadian clock protein 


73 


26 


1285 


gi2641617 


Drosophila 
virilis 


TIM 


73 


26 


1286 


gi6013135 


Rattus 
norvegicus 


coxsackie-adenovirus-receptor 
homolog 


86 


67 


1286 


AAV50429_ 
aal 


Homo sapiens 


UYNY Human coxsackievirus and Ad2 
and Ad5 receptor (HCAR) cDNA. 


83 


I J 


1286 


AAV28845_ 
aal 


Homo sapiens 


DAND Human coxsackievirus and 
adenovirus receptor encoding DNA. 


83 


75 


1287 


AAU83224 


Homo sapiens 


ZYMO Novel secreted protein 
Z930757G12P. 


042 


lw 


1287 


AAY70692 


Homo sapiens 


DAND Human soluble attractin-2. 


OA 

84 


ZA 


1287 


AAY70691 


Homo sapiens 


DAND Human membrane attractin-2. 


84 


54 


1288 


AAW70326 


Homo sapiens 


GEMY Secreted protein DU123 1.


1655 


99 


1288 


ABB12473


Homo sapiens 


HYSE- Human bone marrow expressed 
protein SEQ ID NO: 312. 


547 


72 


1288 


gi5689736 


Homo sapiens 


Myopodin protein 


475 


100


1289 


gi4103543 


Tomato chlorosis 
virus 


heat shock protein 70 


73 


29 


1289 


gil2247413 


Cristatella 
mucedo 


cytochrome b 


72 


30 


1289 


gi|4103543|g 
b|AAD0179 
0.1| 


Tomato chlorosis 
virus 




73 


29 


1291 


AAB94128 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:14383.


520 


98 


1291 


AAY85576 


Homo sapiens 


JANC Hs-UNC-53/1 fragment/GFP 
fusion insert of plasmid pGI3150. 


520 


98 


1291 


AAY85564 


Homo sapiens 


JANC Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


520


98


1292 


AAY01413 


Homo sapiens 


HUMA- Secreted protein encoded by 
gene 31 clone HHBAG64. 


207 


97 


1292 


AAY05324 


Homo sapiens 


GEMY Human secreted protein 
ij!67 5. 


207 


97 


1292 


gil5157864 


Agrobacterium 
tumefaciens str. 
C58 (Cereon) 


AGR_C_4816p 


71 


34 


1294 


AAB12146 


Homo sapiens 


PROT- Hydrophobic domain protein 
from clone HP10672 isolated from 
Thymus cells. 


219 


100 


1295 


gi|17228767| 
ref|NP 4853 
15.1| 


Nostoc sp. PCC 
7120 


probable glycogen phosphorylase 


78 


34 


1295 


gi|10835203| 
ref|NP 0011
27.1| 


Homo sapiens 


advanced glycosylation end product- 
specific receptor 


65 


58 


1295 


gi|190846|gb 
|AAA03574.

1| 


Homo sapiens 


receptor for advanced glycosylation 
end products 


65 


58


1296 


gil7511816 


Homo sapiens 


Similar to RIKEN cDNA 1110032022
gene 


1268 


99 


1296 


AAB88440 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0222. 


688 


100 


1296 


gi7211438 


Homo sapiens 


golgin-67 


94 


30 


1298 


gil83 14436 


Homo sapiens 


Similar to RIKEN cDNA 4921511C04
gene 


481 


79 


1298 


gil872546 


Mus musculus 


NIK 


86 


25 


1298 


gi5533305 


Homo sapiens 


somatostatin receptor interacting 
protein splice variant a 


85 


29 


1299 


gil334643 


Xenopus laevis


APEG precursor protein 


105 


27 


1299 


gil7428053 


Ralstonia 
solanacearum 


PROBABLE RIBONUCLEASE E 
(RNASE E) PROTEIN 


100 


32 


1299 


gi6690017 


Herpesvirus 
papio 


NTR 


96 


25 


1300 


AAB87346 


Homo sapiens 


HUMA- Human gene 5 encoded 
secreted protein HDPIE85, SEQ ID 
NO:87. 


586 


/4 


1300 


AAB44298 


Homo sapiens 


GETH Human PRO/06 (UNQ370) 
protein sequence SEQ ID NO:3o5.


3oO 


/*f 


1300 


AAY41742 


Homo sapiens 


GETH Human PRO/06 protein
sequence. 


JoO 


74 


1301 


gi2 18572 


Pan troglodytes 


prot GOR 


1 1AA 


OZ 


1301 


gi243898 


Pan 


GOR 


1040 


68 


1301 


gil7862570 


Drosophila 
melanogaster 


LD38414p 


486 


4j 


1302 


giiiZ/ojyo 


Homo sapiens 


HT61404 7 (Novel protein)


260 


28 


1302 


gil3397804 


Homo sapiens 


dJ616B8.3 (novel gene) 


230 


30 


1302 


AAB56641 


Homo sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO: 1219. 


226 


30 


1303 


gi603989 


Drosophila 
melanogaster 


salivary gland glue protein 


149 


23 


1303 


gi 13324584 


Borrelia 
burgdorferi 


LMP1 


129 


17


1303 


gil61956 


Trypanosoma 
cruzi 


surface antigen 




1 1 


1304 


gil3569248 


Human
immunodeficiency
virus type 1


gag protein 


81


34 


1304 


gi4324832 


Human
immunodeficiency
virus type 1


gag-pol polyprotein 


80 


29 


1304 


gil 1691875 


Mus musculus 


ADP-ribosylation factor 1 GTPase 
activating protein 


79 


22


1305 


AAO06469 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 20361. 


191 


100 


1305 


gi3608368 


Xenopus laevis 


origin recognition complex associated 
protein p81 


69 


30 


1305 


ABB15196 


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 3853.


68 


36 


1306 


AAE03657 


Homo sapiens 


INCY- Human extracellular matrix and 
cell adhesion molecule-21 (XMAD- 
21). 


109 


27 


1306 


ABB11890


Homo sapiens 


HYSE- Human protocadherin 
Flamingo 1 homologue, SEQ ID 
NO:2260. 


109 


27 


1306 


gi3449298 


Homo sapiens 


MEGF2 


109 


27 


1308 


gi9294050 


Arabidopsis 
thaliana 


protein kinase-like protein 


84 


32 


1308 


gil5983765 


Arabidopsis 
thaliana 


AT3g24550/MOB24_8 


84 


32 


1308 


gil3877617 


Arabidopsis 
thaliana 


protein kinase-like protein 


84 


32 


1309 


AAU00375 


Homo sapiens 


BERN/ Human stem cell growth factor 
receptor. 


127 


54 


1309 


AAE07145 


Homo sapiens 


SALK Human Kit/stem cell factor 
receptor kinase insert region. 


127 


54 


1309 


gi3236223 


Equus caballus 


tyrosine kinase receptor homolog 


127 


50 


1310 


gi21449343


Actinosynnema 
pretiosum subsp. 
auranticum 


polyketide synthase 


77 


46 


1310 


gi21114513 


Xanthomonas 
campestris pv. 
campestris str. 
ATCC 33913 


transcriptional regulator 


75 


36 


1310


gil3364364 


Escherichia coli 
O157:H7


acetylglutamate kinase 


73 


36 


1311 


gi20146220 


Oryza sativa
(japonica
cultivar-group)


similar to splicing factor/activator 
protein 


110


33 


1311


gi206712 


Rattus 
norvegicus 


salivary proline-rich protein 




27 


1311 


AAY84592 


Homo sapiens 


UNIW Amino acid sequence of a
human artemin polypeptide. 


103 


34 


1312 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


530 


69 


1312 


gi|10834720|
gb|AAG237
90.1|AF258
587 1


Homo sapiens


PP565


249


66


1312 


gi|13194728|
gb|AAK155
26.1|AF329 
451 1 


Gallus gallus


pol-like protein ENS-3 


115 


21 


1313 


AAW03515 


Homo sapiens 


orlisJ riuman DUUKloU protem. 




CO 
JO 


1313 


gil339910 


Homo sapiens 


DOCK180 protein


147 


58 


1313 


gu 504002 


Homo sapiens 


similar to a human major CRK-binding
protein DOCK180.


1 1 1 
ill 


A'l 


1314 


gil2007418 


Mus musculus 


B3 olfactory receptor 


/o 


Jo 


1314 


gil8480290 


Mus musculus 


olfactory receptor MOR260-3 


76 


38 


1314 


gil2007432 


Mus musculus 


B3 olfactory receptor 


76 


38 


1315 


gi483581 


Mus musculus 


Notch 3 


82 


26 


1315 


gil8159668 


Pyrobaculum 
aerophilum 


paREP2b 


81 


29 


1315 


gi4584086 


Spermatozopsis 
similis 


p210 protein


79 


25 


1316 


AAM71305 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 31611. 


422 


98 


1316 


AAM58790 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 30895. 


422 


98 


1316 


gil49490 


Lactococcus 
lactis 


sucrose-6-phosphate hydrolase 


72 


31 


1317 


gil 620040 


Paramecium 
bursaria 

Chlorella virus 1 


Asp-rich 


72 


28 


1317 


gi3721615 


Cyprinus carpio 


MEF2C 


71 


25 


1317 


gi|9631936|r 
ef|NP 0487 
25.1| 


Paramecium 
bursaria 

Chlorella virus 1 


Asp-rich 


72 


28 


1318 


gi|21291797|

gb|EAA039 

42.1| 


Anopheles 
gambiae str. 
PEST 


agCP3974 


74 


35 


1319 


gi21306283 


Chlamydomonas 
reinhardtii 


iron transporter Ftr1


74 


30 


1319 


AAB60461 


Homo sapiens 


INCY- Human cell cycle and 
proliferation protein CCYPR-9, SEQ 
ID NO:9.


73 


33 


1319 


gi6013155 


Homo sapiens 


p35srj 


73 


33 


1320 


gi9717245 


Mus musculus 


cytoplasmic dynein heavy chain 


430 


94 


1320 


gi402528 


Rattus 
norvegicus 


cytoplasmic dynein heavy chain 


430 


94 


1320 


gi294543 


Rattus 
norvegicus 


dynein heavy chain 


430 


94 


1323 


gi|17221411| 
emb|CAD12 
639.1| 


Burkholderia
cepacia 


kdo transferase 


70 


J4 


1324 


gil698601 


Cricetulus 
griseus 


beta-1,6-N-
acetylglucosaminyltransferase


440 


38 


1324 


gi349091 


Rattus 
norvegicus 


N-acetylglucosaminyltransferase V 


438 


43 


1324 


gil8997007 


Mus musculus 


N-acetylglucosaminyltransferase V


438 


43


1325


AAM70545 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 30851. 


115 


47 


1325 


AAM58098 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 30203. 


115 


47 


1325 


AAM72994 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 33300. 


111 


28 


1326 


gil2724969 


Lactococcus 
lactis subsp. 
lactis 


phenolic acid decarboxylase 


77 


46 


1327 


AAB53097 


Homo sapiens 


GETH Human angiogenesis-associated 
protein PRO1246, SEQ ID NO:167.


372 


63 


1327 


AAU12416 


Homo sapiens 


GETH Human PRO1246 polypeptide
sequence. 


372 


63 


1327 


AAY99377 


Homo sapiens 


GETH Human PRO1246 (UNQ630)
amino acid sequence SEQ ID NO:132.


372 


63 


1328 


gi6014505 


Hepatitis GB 
virus B 


polyprotein 


76 


43 


1328 


gi765145 


Hepatitis GB 
virus B 


polypeptide 


68 


41 


1328 


gi|20544059| 
ref|XP 0862
20.4| 


Homo sapiens 


similar to U4/U6-associated RNA 
splicing factor 


294 


100 


1329 


AAV42689_ 
aal 


Homo sapiens 


SIBI- DNA encoding human calcium 
channel alpha-2 subunit. 


158 


91 


1329 


AAQ84667_ 
aal 


Homo sapiens 


SALK Human neuronal calcium 
channel subunit alpha 2c. 


158 


91 


1329 


AAQ84664_ 
aal 


Homo sapiens 


SALK Human neuronal calcium 
channel subunit alpha 2b. 


158 


91 


1330 


gil9923 


Nicotiana 
tabacum 


pistil extensin like protein, partial CDS 


71 


38 


1330 


gi|144429|gb 
|AAA56792. 


Cellulomonas 
fimi 


beta-1,4-xylanase


67 


30


1331 


gi2388676 


Mytilus edulis 


precollagen P 


85 


35 


1331 


gil7862044 


Drosophila
melanogaster 


LD06016p 


75 


30 


1331 


gil3879780 


Mycobacterium 

tuberculosis 

CDC1551 


PE_PGRS family protein 


74 


30 


1333 


AAO00015 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 13907. 


442 


61 


1333 


AAB82479 


Homo sapiens 


ZYMO Human RING finger protein 
Zapop2. 


81 


31 


1333 


gi20975274 


Homo sapiens 


skeletrophin 




n 
j i 


1334 


ABB11819 


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO:2189.


367 


82 


1334 


AAW80398 


Homo sapiens 


GEMY A secreted protein encoded by 
clone cwl543 3. 


130 


67 


1334 


gi5081693 


Samanea saman 


pulvinus inward-rectifying channel 
SPICK2 


70 


34 


1335 


ABB89969 


Homo sapiens 


HUMA- Human polypeptide SEQ ID
NO 2345.


142


96


1335 


AAB38385 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 18 clone HTLEJ24. 


142 


96 


1335 


AAB38338 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 18 clone HTLFE57. 


142 


96 


1336 


gi|14590195| 
ref|NP 1422
60.1| 


Pyrococcus 
horikoshii


asparaginyl-tRNA synthetase 


70 


37 


1337 


gi3879419 


Caenorhabditis 
elegans 


contains similarity to Pfam domain: 
PF00102 (Protein-tyrosine 
phosphatase), Score=51.6, E-
value=1.8e-14, N=1


69 


29 


1337 


gi|17563828| 
ref|NP 5059 
65.1|


Caenorhabditis 
elegans 


protein tyrosine phosphatase 


69 


29 


1338 


gi|2072960|g 

b|AAC5126 

8.1|


Homo sapiens 


p40 


138 


33 


1338 


gi|4185940|e 
mb|CAA768 
80.1| 


Human 
endogenous 
retrovirus K 


env protein 


124 


75 


1338 


gi|757872|e 

mb|CAA577

23.1| 


Human 

endogenous 

retrovirus 


env 


124 


75 


1340 


gil491979 


Molluscum
contagiosum 
virus subtype 1 


MC036R 


78 


33 


1340 


gi|962896S|r
ef|NP 0439
87.1|


Molluscum 

contagiosum 

virus 


MC036R 


78 


33 


1341 


gil8676514 


Homo sapiens 


FLJ00154 protein


1560 


100 


1341 


AAB84252 


Homo sapiens 


HUMA- Amino acid sequence of a 
human cytokine receptor-like protein. 


572 


63 


1341 


AAB84251 


Homo sapiens 


HUMA- Human cytokine receptor-like 
protein fragment. 


572 


63 


1342 


AAY27757 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene No. 47. 


152 


71 


1342 


AAB27551 


Homo sapiens 


MYRI- Human tumour suppressor 
BRG1 encoded by cDNA mutated at 
base 1705. 


77 


32 


1342 


AAB27550 


Homo sapiens 


MYRI- Human tumour suppressor 
BRG1 protein from cell lines DU145 
and NCI-H1300.


77 


32 


1344 


gi21464394 


Drosophila 
melanogaster 


RE18651p 


78 


26 


1344 


AAM39065 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NU ZzW. 


77 


21 


1344 


gi338290 


Homo sapiens 


son3 protein 


77 


21 


1345 


gi2202 


Canis sp.


Clox 


135 


37 


1345 


gi3879551 


Caenorhabditis 
elegans 


contains similarity to Pfam domain:
PF01391 (Collagen triple helix repeat
(20 copies)), Score=56.4, E-value=2e-
13, N=2; PF01484 (Nematode cuticle
collagen N-terminal domain),
Score=87.2, E-value=1.1e-22, N=1


125


33


1345 


gil58695 


Drosophila 
melanogaster 


tropomyosin isoform 33 (9C)


115


in 


1346 


gi7862077 


Giardia 
intestinalis


3-hydroxy-3-methylglutaryl-coenzyme 
A reductase 


y\j 


OA 


1346 


gil098615 


Mycoplasma 
pneumoniae 


adhesin-related 30 kDa protein 


Of 


TI 

£5 


1346 


gi20380058 


Homo sapiens 


Similar to PRAM-1 protein 


OA 


Zo 


1347 


gil3905302 


Mus musculus 


Similar to ATPase, class II, type 9A 


736 


85 


1347 


gil7862322 


Drosophila 
melanogaster 


LD22119p 


633 


72 


1347 


AAM25271 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO:786.


572 


1 Af> 


1348 


gi456319 


Bacteriophage 
FC1 


74kDa protein 


75 


33 


1348 


gil524115 


Lycopersicon 
esculentum 


subtilisin-like endoprotease 


73 


28 


1348 


gi4200334 


Lycopersicon 
esculentum 


P69A protein 


73 


28 


1349 


gi21391988 


Drosophila 
melanogaster 


HL08052p 


78 


31 


1349 


gi20148339 


Arabidopsis 
thaliana


cyclin delta-3 


77 


25 


1349 


gi|17647607| 
ref|NP_5234
23.1| 


Drosophila 
melanogaster 


maroon-like; bronzy; section 5 


78 


31 


1351 


gil 8676524 


Homo sapiens 


FLJ00159 protein 


164 


52 


1351 


gi21392066 


Drosophila 
melanogaster 


RE04357p 


139 


34 


1351 


AAB92637 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:10953.


81 


43 


1352 


gil9071965 


Aspergillus 
oryzae 


chitin synthase 


79 


28 


1352 


gil7945592 


Drosophila 
melanogaster 


RE26660p 


78 


41 


1352 


gil6184663 


Drosophila 
melanogaster 


LD28370p 


74 


22 


1353 


gi|11037117| 
gb|AAG274 
85.1|AF194 
537 1 


Homo sapiens 


NAG13


307 


65 


1353 


gi|1335205|e 
mb|CAA364 
80.1| 


Homo sapiens 


ORFH 


305 


65 


1354 


gil388166 


Drosophila 
melanogaster 


Bowel 


80 


32 


1354


" i ccc? 1 OT 
gll3JD3lo/ 


Scyliorhinus 
canicula 


homeodomain protein Otx1


70 

ly 


01 


1354 


AAY85573 


Homo sapiens 


JANC Hs-UNC-53/3 fragment/GFP 
fusion insert of plasmid pGI3303. 


78 


26 


1358 


gi|21288288| 

gb|EAA006

09.1| 


Anopheles 
gambiae str. 
PEST 


agCP9766 


71 


30 


1358 


gi|17465558|
ref|XP 0698
88.1|


Homo sapiens


similar to mucin


68


36


1359 


gi|21302892| 

gb|EAA150 

37.1| 


Anopheles 
gambiae str. 
PEST 


agCP5020 


70 


31 


1361 


gil5080686 


Lentinula edodes 


CDC5 


79 


26 


1361 


gi495516 


Plasmodium 
vivax 


circumsporozoite protein 


77 


31 


1361 


gi21070569


Dictyostelium 
discoideum 


VSAE2 (FRAGMENT). 3/101 


76 


31 


1362 


gi8953400 


Arabidopsis 
thaliana 


1-D-deoxyxylulose 5-phosphate 
synthase-like protein 


73 


23 


1362


gi|15239030| 
ref|NP 1966 
99.1| 


Arabidopsis 
thaliana 


1-D-deoxyxylulose 5-phosphate 
synthase-like protein


73 


23 


1363 


gi2444430 


Xenopus laevis 


deacetylase 


327 


81 


1363 


gi602098 


Xenopus laevis 


yeast RPD3 homologue 


324 


80 


1363 


AAB49954 


Homo sapiens 


METH- Human histone deacetylase 
HDAC-1. 


323 


80 


1364 


AAM69686 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 29992. 


418 


55 


1364 


AAM57281 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 29386. 


418 


55 


1364 


gi|1780971|e 
mb|CAA714 
16.1| 


Human 
endogenous 
retrovirus K 


gag protein 


172 


37 


1365 


gi437084 


Gallus gallus 


vitamin D3 hydroxylase associated 
protein 


510 


41 


1365 


gi2149156 


Homo sapiens 


fatty acid amide hydrolase 


477 


38 


1365 


AAW57783 


Homo sapiens 


SCRI Human fatty acid amide 
hydrolase. 


468 


38 


1366 


gi3510695 


Homo sapiens 


DNA polymerase theta 


77 


21 


1366 


gi309132 


Mus musculus 


calnexin 


72 


22 


1366 


gil5214567 


Mus musculus 


Similar to calnexin 


72 


22 


1367 


gi|17508849| 
ref|NP 4914
26.1| 


Caenorhabditis 
elegans 


helicase 


73 


40 


1368 


gi5457567 


Pyrococcus 
abyssi 


Na+/H+ antiporter (napA-1) 


76 


33 


1368 


gi8247211 


Candida albicans 


She9 protein 


69 


31


1368 


gi|14590079|
ref|NP 1421
43.1| 


Pyrococcus 
horikoshii 


Na(+)/H(+) antiporter 


76 


30 


1369 


gil7644260 


Homo sapiens 


DB206I21.1 (ATPase, Class VI, type 
11C) 


305 


98 


1369 


AAO14200 


Homo sapiens 


INCY- Human transporter and ion 
channel TRICH-17. 


166 


50 


1369 


gi5080816 


Arabidopsis 
thaliana 


Putative ATPase 


166 


49 


1370 


gi|18573281| 
ref|XP 0959
33.1| 


Homo sapiens 


similar to 40S ribosomal protein S3A


70 


38


1372


gi6683562 


Mus musculus 


heparan sulfate 6-sulfotransferase 3 


886 


91 


1372 


gi6683558 


Mus musculus 


heparan sulfate 6-sulfotransferase 2 


265 


72 


1372 


ABL39900_ 
aal 


Homo sapiens 


SEGK Human HS6ST2v encoding
cDNA of SEQ ID NO:1.


262 


71 


1373 


gi|20882231| 
ref|XP 1392 
03.1| 


Mus musculus 


similar to LIM domain only 7


76 


24 


1373 


48.1|AF498 
989 1 


Medicago sativa


nodule-specific glycine-rich protein 3


72 


26 


1373 


gi|9965267|g 
b|AAG1000 
8.1|


infectious
hypodermal and
hematopoietic
necrosis virus


non-structural protein 2 


72 


24 


1374 


gi3355835 


Rhizobium etli 


RBSK 


78 


32 


1374 




a viiydugiuixi 
cellulosum 


epoD 


73 


28 


1374 




Schizosaccharom 
yces pombe 


similar to Saccharomyces cerevisiae 
porphobilinogen deaminase, SWISS-
PROT Accession Number P28789


72 


28 


1375 


gi 16973455 


Danio rerio 


beta-3-galactosyltransferase 


1050 


63 


1375 




Homo sapiens 


GETH Human PRO4397 protein
sequence SEQ ID NO:42. 


725 


46 


1375 


AARR8404 


Homo sapiens


HELI- Human membrane or secretory 
protein clone PSEC0159. 


709 


43 


1376 


gi7668 


Drosophila
melanogaster


osgzju protem 


73 


33 


1376 


gi20177037


melanogaster 


IAJ/L 1 oH^Jp 


73 


33 


1376 


ril353669 


elegans 




69 


43 


1379 


AAS16182 
aal 


Homo sapiens


GENA- Human apolipoprotein CI
(APOC1) DNA.


245 


67 


1379 


AAU10534 


Homo sapiens


GENA- Human apolipoprotein CI
(APOC1) polypeptide.


245 


67 


1379 


AAS16825_ 
aal 


Homo sapiens 


GENA- Human apolipoprotein CI 
(APOC1) DNA coding sequence.


245


67 


1380 


AAY36290 


Homo sapiens 


HUMA- Human secreted protein 

f*T\r*r\Af*f\ \\\r nana /»T 

cnuuucu oy gene o / . 


177 


74 


1380 


gil6551305 


Tatianyx 


DNA-directed RNA polymerase beta' 
subunit 2


71 


38 


1380 


gi3411013


Candida albicans 


protein mannosyltransferase 1 


68 


35 


1381 




Homo sapiens


HYSE- Human protein SEQ ID NO
3778. 


173 


66 


1381 


gi4731867


Dictyostelium
discoideum


sterol glucosyltransferase


107 


30 


1381 


AAB74726 


Homo sapiens 


INCY- Human membrane associated 
protein MEMAP-32. 


89 


41 


1382 


AAB62100 


Homo sapiens 


WIST- Human bridging integrator-2 
(Bin2) protein. 


78 


27 


1382 


gi6527168 


Homo sapiens 


breast cancer associated protein 
BRAP1 


78 


27 


1382 


gi5852834 


Homo sapiens 


bridging integrator-2 


78 


27


1383


gi7670050 


Xenopus laevis 


type I collagen alpha 1 


92 


27 


1383 


AAO01606 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 15498. 


85 


29 


1383 


gil7738485 


Agrobacterium 
tumefaciens str. 
C58 (U. 
Washington) 


biopolymer transport protein 


85 


28 


1384 


gi20451261 


Caenorhabditis 
elegans 


C. elegans GCY-17 protein
(corresponding sequence W03F11.2)


71 


26 


1384 


gi2665714 


Agrobacterium 
tumefaciens 


moaC 


71 


29 


1384 


gi|20864452|
ref|XP 1500
76.1| 


Mus musculus 


RIKEN cDNA 2410018E23


130 


59 


1385 


AAY94938 


Homo sapiens 


GEMY Human secreted protein clone 
ye78 1 protein sequence SEQ ID 
NO:82. 


103 


25 


1385 


gil2831176 


Agelaius 
phoeniceus 


gamma filamin protein 


96 


29 


1385 


AAU81998 


Homo sapiens 


INCY- Human secreted protein 
SECP24. 


87 


27 


1386 


gil0440468 


Homo sapiens 


FLJ00070 protein 


102 


41 


1386 


gilll36912 


Danio rerio 


RPTP-alpha protein 


94 


32 


1386 


gi20377083 


Homo sapiens 


p78 


92 


36 


1387 


AAM40810 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5741. 


190 


59 


1387 


AAM39024 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2169. 


190 


59 


1387 


gil 5080474 


Homo sapiens 


Similar to RIKEN cDNA 1700023011
gene 


190 


59 


1388 


gil 2802591 


Bovine 
herpesvirus 4 


tegument protein 


82 


30 


1388 


gi950226 


Saccharomyces 
cerevisiae 


Trf4p 


73 


26 


1388 


gi|13095641| 
reflNP 0765 
56.1| 


Bovine 
herpesvirus 4 


tegument protein 


82 


30 


1389 


AAI67224_ 
aal 


Homo sapiens 


CORI- B51 IS cDNA sequence. 


363 


100 


1389 


AAF85500_ 
aal 


Homo sapiens 


EOSB- Nucleotide sequence of a 
human breast cancer protein designated 
BCH1. 


363 


100 


1389 


AAA54120_ 
aal 


Homo sapiens 


EOSB- Breast cancer protein BCH1 
coding sequence. 


363 


100 


1390 


gil84653 


Homo sapiens 


IFN-alpha responsive transcription
factor 


74 


30 


1390 


gi|2580453|g 

b|AAB8233 

6.1| 


Xenopus laevis 


Xbap 


68 


47 


1391 


AAB88456 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0246. 


85 


52 


1391 


AAB62392 


Homo sapiens 


LEXI- Human LDL receptor family 
protein (LDLP). 


85 


52 


1392 


ABB12009


Homo sapiens 


HYSE- Human RAMP1 homologue,
SEQ ID NO:2379.


90


100


1392 


RB171910 


Homo sapiens 


RAMP1 


90 


100 


1392 


gil2653551 


Homo sapiens 


receptor (calcitonin) activity modifying 
protein 1 


90 


1 An 


1394 


gi4467343 


Drosophila 
melanogaster 


EG:140G11.1 


70 


Li 


1394 


gi6018879 


Drosophila 
melanogaster 


BACN4L24.d 


70 


27 


1394 


gil 57993 


Drosophila 
melanogaster 


developmental protein 


70 


27 


1395 


gi4928919 


Arabidopsis 
thaliana 


zinc finger protein 2 


86 


26 


1395 


gi2702272 


Arabidopsis 
thaliana 


expressed protein 


86 


26 


1396 


AAM25276 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO:791.


729 


93 


1396 


AAE14340 


Homo sapiens 


INCY- Human protease PRTS-5 
protein. 


528 


33 


1396 


AAB47561 


Homo sapiens 


INCY- Protease PRTS-3. 


528 


33 


1397 


gil 8369843 


Infectious 
salmon anemia 
virus 


P6 


89 


40 


1397 


gi4092530 


Infectious 
salmon anemia 
virus 


NS1 protein 


87 


39 


1397 


gil4009648 


Infectious 
salmon anemia 
virus 


NS1 


87 


39 


1398 


AAW63707 


Homo sapiens 


UYOR- Human hSK2 protein. 


331 


91 


1398 


gil575663 


Rattus
norvegicus 


calcium-activated potassium channel 
rSK2 


331 


91 


1398 


gil5082148 


Homo sapiens 


small-conductance calcium-activated 
potassium channel 


331 


91 


1399 


AAB01381 


Homo sapiens 


INCY- Neuron-associated protein. 


1653 


68 


1399 


gil8157547 


Mus musculus


pecanex-like 3 


1620 


66 


1399 


gi6650377 


Mus musculus 


pecanex 1 


1277 


51 


1400 


gi|20887681|
ref|XP 1405 
75.1| 


Mus musculus 


similar to melastatin 1


468 


91 


1400 


gi|3243075|g 

b|AAC8000

0.1| 


Homo sapiens 


melastatin 1 


355 


75 


1400 


gi|20552333| 
ref|XP 0076
62.9| 


Homo sapiens 


similar to melastatin 1 


355 


75 


1401 


AAU15955 


Homo sapiens 


HUMA- Human novel secreted protein, 

Qart TT% OAS 

oeq lu wo. 


931 


92 


1401 


gi3978441 


Homo sapiens 


PITSLRE protein kinase alpha SV9 
isoform 


95 


24 


1401 


gil517914 


Homo sapiens 


monocytic leukaemia zinc finger
protein 


91 


28 


1402 


gil289326 


Mus musculus 


ROR-alpha 1 


84 


25 


1402 


gi530878 


Chlamydomonas 
eugametos 


amino acid feature: N-glycosylation
sites, aa 41 .. 43, 46 .. 48, 51 .. 53, 72 ..
74, 107 .. 109, 128 .. 130, 132 .. 134,
158 .. 160, 163 .. 165; amino acid
feature: Rod protein domain, aa 169 ..
340; amino acid feature: globular
protein domain, aa 32 .. 168


79


32


1402 


gi220763 


Rattus 
norvegicus 


HES-3 factor 


79 




1403 


gi|20479430| 
ref|XP 1149
55.1|


Homo sapiens 


similar to olfactory receptor MOR231-
1


71 


JZ 


1403 


gi|20480897| 
ref|XP_1150
14.1| 


Homo sapiens 


similar to olfactory receptor MOR234- 
3 


71 


32 


1404 


AAA88548_ 
aal 


Homo sapiens 


SMIK Human CASB616 cDNA.


RQ 


l \>\J 


1404 


AAB19591 


Homo sapiens 


SMIK Human CASB616.


QO 
oy 


1UU 


1404 


gil 100110 


Homo sapiens 


protein-tyrosine kinase 


oy 




1405 


gi4206753 


Oryctolagus 
cuniculus 


homeodomain-containing protein


74 


24 


1405 


gil3445253 


Mus musculus 


orphan Gpr37-like protein 1 


72 


33 


1405 


K i3080552 


Mus musculus 


Hoxa-9 


71 


50 


1406 


AAM50585 


Homo sapiens 


NISB Benign prostatic hyperplasia 
associated protein JT460914. 


325 


100 


1406 


gi!8031947 


Homo sapiens 


SOCS box protein ASB-5 


325 


100 


1406 


AAU20593 


Homo sapiens 


HUMA- Human secreted protein, Seq 
ID No 585. 


316 


100 


1407 


AAU83222 


Homo sapiens 


ZYMO Novel secreted protein 
Z930005G2P. 


895 


97 


1407 


AAY02712 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 63 clone HBJFV28. 


91 


56 


1407 


AAO00641 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 14533. 


86 


64 


1408 


ABB17944 


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 6601. 


81 


53 


1408 


AAM77906 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 38212. 


72 


40 


1408 


AAM65199 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 37304. 


72 


40 


1409 


gi5230847 


Vitreoscilla sp. 
C1


glutamine synthetase homolog 


CO 

08 


11 

3D 


1409 


gi8515736 


Drosophila 
melanogaster 


highwire 


0/ 


1^ 
JJ 


1409 


gi3138797 


Sulfolobus 
shibatae 


Ssh7b 


65 


48 


1410


a a won no 


Homo sapiens


EIJI- Human Werner's syndrome WS-2 
protein. 


151 


96 


1410 


gil913785 


Homo sapiens 


Rep-8 


151 


96 


1410 


gi 18089098 


Homo sapiens 


reproduction 8 


151 


96 


1411 


gi|21297468| 

gb|EAA096 

13.1| 


Anopheles 
gambiae str. 
PEST 


agCP15537 


166 


56 


1411 


gi|20983200|
ref|XP 1358
12.1|


Mus musculus


RIKEN cDNA 1810030007


73


24


1412 


gi532572 


Hordeum 
vulgare 


lipoxygenase 1 


oZ 


Zo 


1412 


gi945419 


Mus musculus 


hepatoma derived growth factor 
(HDGF) 


77 


35 


1412 


gil7932895 


stork hepatitis B 
virus 


preC/core antigen 


77 


26 


1413 


gi2370143 


Homo sapiens 


immunoglobulin-like domain-
containing 1


169 


42


1413 


gi2645890 


Homo sapiens 


IGSF1 


169 


42 


1413 


AAB40232 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 46 SEQ ID 
NO: 142. 


162 


40 


1414 


gi21204314 


Staphylococcus 
aureus subsp. 
aureus MW2 


proline-tRNA ligase 


78 


32 


1414 


gil4247033 


Staphylococcus 
aureus subsp. 
aureus Mu50 


proline-tRNA ligase 


78 


32 


1414 


gil3701063 


Staphylococcus 
aureus subsp. 
aureus N315


proline-tRNA ligase 


78 


32 


1415 


gi9948469 


Pseudomonas 
aeruginosa 


probable non-ribosomal peptide 
synthetase 


78 


31 


1415 


AAE19251 


Homo sapiens 


BIOI- SOS1 protein sequence from 
PS462. 


75 


23 


1415 


AAU84311 


Homo sapiens 


BAAK/ Protein ABCB2 differentially 
expressed in breast cancer tissue. 


74 


30 


1416 


gi18676710


Homo sapiens 


FLJ00254 protein 


623 


75 


1416 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


583 


69 


1416 


gi|18676710| 
dbj|BAB850 
07.1| 


Homo sapiens 


FLJ00254 protein 



623 


75 


1417 


AAR85785 


Homo sapiens 


UYNY Human GRB-10. 


77 


32 


1417 


gi841210 


Mus musculus 


growth factor receptor binding protein 
Grb10


77 


32 


1417 


AAM90963 


Homo sapiens 


HUMA- Human 

immune/haematopoietic antigen SEQ 
ID NO: 18556. 


74 


32 


1419 


AAM79990 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3636. 


82 


100


1419 


AAM79006 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1668. 


82 


100


1419 


AAR28494 


Homo sapiens 


XIAM/ Sequence encoded by the 
CAMPATH-1 antigen cDNA. 


82 


100


1420 


AAU01383 


Homo sapiens 


MILL- Human TANGO 499 form 2,
variant 1 amino acid sequence.


828 


73 


1420 


AAU01382 


Homo sapiens 


MILL- Human TANGO 499 form 2, 
variant 4 amino acid sequence. 


828 


73 


1420 


AAU01380 


Homo sapiens 


MILL- Human TANGO 499 form 2, 
amino acid sequence. 


828 


73 


1421 


gil9069609 


Encephalitozoon 
cuniculi 


PROTEASOME REGULATORY 
SUBUNIT YTA6 OF THE AAA 


76 


26 
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Score 


% 
Identity 








FAMILY OF ATPASES 






1422 


AAM66177 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26483. 


199 


72 


1422 


AAM53791 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 25896. 


199 


72 


1422 


AAM68472 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 28778. 


176 


81 


1423 


gil 800227 


Oryza sativa


Bowman-Birk proteinase inhibitor 


/4 


1A 

34 


1423 


gil0141005 


San Miguel sea 
lion virus 


non-structural polyprotein 


74 


26 


1423 


gi|17490177| 
ref|XP_0623
00.1|


Homo sapiens 


similar to RING finger protein 18 
(Testis-specific ring-finger protein) 


76 


oo 
28 


1424 


gi461336 


Pyrenomonas 
salina 


hsp70 


75 


29 


1424 


gil3880037 


Mycobacterium 

tuberculosis 

CDC1551 


membrane protein, MmpL family 


75 


24 


1424 


gil449306 


Mycobacterium 

tuberculosis 

H37Rv 


mmpL2 


75 


24 


1425 


gil5600 


Enterobacteria 
phage T7 


gene 7.3, host range 


79 


OA 

30 


1425 


gil6198065 


Drosophila 
melanogaster 


LD28477p 


77 


OA 

3U 


1425 


gil 1870012 


Drosophila 
melanogaster 


xnp/atr-x DNA helicase 


77 


OA 

3U 


1426 


gil6185397 


Drosophila 
melanogaster 


LD39815p 


OA/1 

204 


A A 

44 


1426 


gi2244793 


Arabidopsis 
thaliana 


disease resistance N like protein 


86 


*> A 

30 


1426 


AAU84280 


Homo sapiens 


BGHM Human endometrial cancer 
related protein, HERCL 


77 


26 


1427 


AAY36302 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 79. 


183 


79 


1427 


AAB88359 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0087. 


178 


80 


1427 


AAM41635 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 6566. 


178 


80 


1428 


AAU82008 


Homo sapiens 


INCY- Human secreted protein 
SECP34. 


114 


64 


1428 


AAB32391 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 21 SEQ ID 

NO: 77.


114 


64 


1428 


AAY08306 


Homo sapiens 


FIBR- Human collagen IX alpha-3
chain protein. 


74 


45 


1429 


gi2792523 


Ralstonia 
solanacearum 


alternative RNA sigma factor RpoS


69 


30 


1429 


gil7428221 


Ralstonia 
solanacearum 


RNA POLYMERASE SIGMA S 
(SIGMA-38) FACTOR 
TRANSCRIPTION REGULATOR 


69 


33 
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Score 


% 
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PROTEIN 






1429 


gi|5032313|r
ef|NP_0040
14.1|


Homo sapiens 


dystrophin Dp140bc isoform;
Dystrophin (muscular dystrophy, 
Duchenne and Becker types) 


73 


26 


1433 


gi9954445 


Rattus 
norvegicus 


TEMO 


171 


62 


1433 


gil4030260 


maize rayado
fino virus


polyprotein 


79 


32 


1433 


AAB95656 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:18419.


77 


36 


1434 


AAR04212 


Homo sapiens 


CALB- Human 32K alveolar surfactant 
protein. 


391 


43 


1434 


AAP60661 


Homo sapiens 


KUSH/ Genomic sequence of human 
alveolar surfactant protein 
(hASP) encoded by genomic DNA.


386 


43 


1434 


AAB58135 


Homo sapiens 


ROSE/ Lung cancer associated 
polypeptide sequence SEQ ID 473. 


366 


42 


1435 


gil7224904 


Mus musculus


immunoglobulin superfamily member 9 


180 


48 


1435 


gi20988778 


Homo sapiens 


Similar to immunoglobulin 
superfamily, member 9 


173 


53 


1435 


gil4149050 


Drosophila 
melanogaster 


turtle protein, isoform 4 


114 


36 


1436 


gil465855 


Caenorhabditis 
elegans 


C. elegans PQN-57 protein 
(corresponding sequence R09F10.7) 


85 


23 


1436 


gil465856 


Caenorhabditis
elegans 


C. elegans PQN-56 protein 
(corresponding sequence R09F10.2) 


85 


23 


1436 


gil7864717 


Mus musculus


hornerin


83 


26 


1437 


gi|21292574|

gb|EAA047 

19.1| 


Anopheles 
gambiae str. 
PEST 


agCP3449 


66 


33 


1438 


ABB10160 


Homo sapiens 


HUMA- Human cDNA SEQ ID NO: 
468. 


166 


62 


1438 


gi9657279 


Vibrio cholerae 


aspartokinase II/homoserine 
dehydrogenase, methionine-sensitive


71 


28 


1439 


gi4582571 


Gallus gallus 


Hyperion protein, 419 kD isoform


75 


24 


1439 


gil3165 


Oenothera 
biennis 


ATPase alpha-subunit (aa 1-511)


72 


26 


1439 


gi903838 


Oenothera 
berteriana 


F-1-ATPase alpha subunit


72 


26 


1440 


gi4558758 


Homo sapiens 


testis-specific chromodomain Y-like 
protein 


233 


62 


1440 


gi4558762 


Mus musculus


testis-specific chromodomain Y-like 
protein 


231 


36 


1440 


gi3342716 


Homo sapiens 


testis-specific ChromoDomain Y 
isoform 1 


195 


36 


1441 


gil55627 


Acanthamoeba
castellanii 


myosin I heavy chain 


118 


42 


1441 


gil3093370 


Mycobacterium 
leprae 


initiation factor IF-2 


116 


33 


1441 


AAY20289 


Homo sapiens 


UYRO- Human apolipoprotein E 
mutant protein fragment 5. 


114 


39 


1442 


gi2253707 


Mus musculus


Daxx 


84 


36 


1442 


gi!934970 


Plasmodium 
falciparum 


AARP1 protein 


79 


65 






WO 03/080795 



PCT/US02/25485 



Table 2 



SEQ 
ID 
NO: 


Accession 
No. 


Species 
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0/ 

Identity 


1442 


gi4050098 


Mus musculus 


Fas-binding protein 


78 


34 


1443 


gi2425111 


Dictyostelium 
discoideum 


ZipA 


90 


26 


1443 


AAY06119 


Homo sapiens 


HARD Human CIITA interacting
protein 104 (CIP104).


88 


26 


1443 


gi5420387 


Leishmania 
major 


proteophosphoglycan 


86 


21 


1444 


gi893355 


Acinetobacter 
baumannii


L-2,4-diaminobutyrate decarboxylase 


77 


26 


1445 


ABB55744 


Homo sapiens 


FECH/ Human polypeptide SEQ ID 
NO 94. 


135 


47 


1445 


AAU39035 


Homo sapiens 


GEMY Human secreted protein 
nh328_5.


135 


47 


1445 


AAY28679 


Homo sapiens 


GEMY Human nh328_5 secreted 
protein. 


135 


47 


1446 


gil9744390 


Homo sapiens 


retinoic acid inducible in 
neuroblastoma cells RAINB1d


247 


54 


1446 


gil9744388 


Homo sapiens 


retinoic acid inducible in 
neuroblastoma cells RAINB1 


247 


54 


1446 


AAY85565 


Homo sapiens 


JANC Human homologue of UNC-53 
(Hs-UNC-53/2) sequence. 


240 


52 


1447 


AAU19716 


Homo sapiens 


HUMA- Human novel extracellular 
matrix protein, Seq ID No 366. 


71 


31 


1447 


gil8025476 


cercopithecine
herpesvirus 15 


BPLF1 


71 


38 


1447 


AAS14575_ 
aa1


Homo sapiens 


MILL- Human cDNA encoding G 
protein-coupled receptor, GPCR, 
52872. 


69 


62 


1448 


gil4027507 


Mesorhizobium 
loti 


salicylate hydroxylase 


69 


31 


1449 


AAG64798 


Homo sapiens 


SREH- Human peptide methionine 
sulphoxide reductase (hPMSR). 


192 


71


1449 


AAB81893 


Homo sapiens 


SEQU- Human genomic database 
related protein SEQ ID NO: 38.


192 


71 


1449 


AAM42046 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6977. 


192 


71 


1450 


gi!8249657 


Mus musculus 


NC8 


1063 


80 


1450 


gi406748 


Mus musculus 


zinc finger protein 


250 


37 


1450 


AAB43498 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO:943. 


249 


37 


1451 


ABB89331 


Homo sapiens 


HUMA- Human polypeptide SEQ ID
NO 1707. 


732 


88 


1451 


gil3421927 


Caulobacter 
crescentus CB15 


MaoC family protein 


273 


42 


1451 


gil9338616 


Methylobacterium
extorquens


R-specific enoyl-CoA hydratase 


261 


44 


1452 


gi|20908171| 
ref|XP_1397
15.1|


Mus musculus 


similar to NADPH oxidase 3; NADPH
oxidase catalytic subunit-like 3


Oo 


in 


1452 


gi|17533619| 
ref|NP_4955
16.1| 


Caenorhabditis 
elegans 


F32A5.8.p 


67 


42 


1453 


gi|15614051| 
ref|NP_2423


Bacillus 
halodurans 


sodium-dependent phosphate 
transporter 


65 


34 
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54.1| 










1454 


gi|17551878| 
ref|NP_4990
90.1| 


Caenorhabditis 
elegans 


TPR Domain 


1& 


29 


1455 


AAM40727 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5658. 


191 


56 


1455 


AAM38941 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2086. 


191 


56 


1455 


gil9702127 


Homo sapiens 


P-Rex1 protein


191 


56 


1456


ABB05666 


Homo sapiens 


GEHU- Human nucleic acid 
management protein clone amy2 1 ln4. 


496 


91 


1456


AAE03372 


Homo sapiens 


HUMA- Human gene 18 encoded 
secreted protein fragment, SEQ ID 
NO: 152. 


496 


91 


1456


AAE03371 


Homo sapiens 


HUMA- Human gene 18 encoded 
secreted protein fragment, SEQ ID 
NO: 150. 


496 


91 


1457 


AAM66940 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27246. 


290 


77 


1457 


AAM54534 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26639. 


290 


77 


1457 


AAM64410 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 36515. 


287 


77 


1458 


AAB53445 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein sequence SEQ ID NO:985.


335 


100 


1458 


AAY30055 


Homo sapiens 


ARIA- Amino acid sequence of a 
FK506-binding protein (FKBP).


165 


91 


1458 


AAQ52277_ 
aa1


Homo sapiens 


VERT- FK506 binding protein 
(FKBP12A) cDNA. 


159 


100 


1460 


AAU20255 


Homo sapiens 


HUMA- Human novel endocrine 
antigen, SEQ ID No 312.


104 


76 


1460 


ABB17663 


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 6320. 


94 


77 


1460 


AAO02331 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 16223. 


88 


61 


1461 


AAM65951 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26257. 


97 


57 


1461 


AAM53568 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 25673. 


97 


57 


1461 


AAU83199 


Homo sapiens 


ZYMO Novel secreted protein 
Z891639G1P. 


96 


38 


1463 


gi5565687 


Homo sapiens 


topoisomerase-related function protein 


CI/1 


/ J 


1463 


gi5139669


Homo sapiens 


LAK-1 


468 


75 


1463 


gi21430468


Drosophila 
melanogaster 


LP06848p 


332 


51 


1464 


AAY91421 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 7 SEQ ID 
NO: 142. 


109 


35 


1464 


AAY91396 


Homo sapiens 


HUMA- Human secreted protein 


109 


35
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sequence encoded by gene 7 SEQ ID 
JNO:ll /. 






1464 


AAY91352 


Homo sapiens 


HUMA- Human secreted protein
sequence encoded by gene 7 SEQ ID 
JNU./j. 


109 


35 


1465 


AAU15978 


Homo sapiens 


HUMA- Human novel secreted protein, 
Seq ID 931.


575 


100 


1465 


AAU15958 


Homo sapiens 


HUMA- Human novel secreted protein,
Seq ID 911.


575 


100 


1465 


gil6041675 


Homo sapiens 


joined to JAZbl 


575 


100 


1466 


AAO01502 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 15394. 


173 


66 


1466 


gi|10947038|
ref|NP_0652
09.1| 


Homo sapiens 


ankyrin 1, isoform 1; ankyrin-1,
erythrocytic; ankyrin-R 


74 


28 


1466 


gi|10947036| 
ref|NP_0652
08.1| 


Homo sapiens 


ankyrin 1, isoform 4; ankyrin-1,
erythrocytic; ankyrin-R


74 


28 


1467 


gil9354550 


Mus musculus 


similar to src homology three
and cysteine rich domain 


842 


91 


1467 


AAU17352 


Homo sapiens 


HUMA- Novel signal transduction
pathway protein, Seq ID y i /.


j \j i 


98 


1467 


gil799566 


Mus musculus


stac 




44 


1468 


gil3506771 


Mus musculus 


structural protein FBF1 


161 


74 


1468 


gi7549210 


Babesia 
bigemina


200 kDa antigen p200


on 


29 


1468 


gil747 


Oryctolagus 
cuniculus 


trichohyalin


191 


30 


1469 


gi 11345048 


Homo sapiens 


SCAN domain-containing protein 2


86 


32 


1469 


fiil 1320940 


Homo sapiens 


SCAND2 


86 


32 


1469 


gil4210722 


Tupaia 
herpesvirus 


t41 


86 


30 


1470 


AAY88278 


Homo sapiens 


MILL- Human TANGO 188 protein. 


1442 


100 


1470 


gi!4336711 


Homo sapiens 


similar to C. elegans protein r i /^o.j


144? 


100 


1470 


AAA39947_ 
aa1


Homo sapiens 


MILL- Human TANGO 188 cDNA.


14^8 

ItJO 


99 


1471 


AAE10204 


Homo sapiens 


HYSE- Human bone marrow derived 
contig protein, SEQ ID NO: 69. 


71 


44 


1471 


AAA23458_ 
aa1


Homo sapiens 


AT T5TJ aHWA pnr>n^i'no tinman 
A-UJrJtl- GUANA cnCOUUlg LLUilkUi 

secreted protein vp15_1, SEQ ID
NO:71. 


67 


46 


1471 


AAB80228 


Homo sapiens 


nUTTI Unmin PTJfY5£Q rvTrvfpin 

\je, i ri riuman r tssjLoy pro ie ill 


67 


46 


1472 


AAB88433 


Homo sapiens 


HELI- Human membrane or secretory 
protem cione r ohlam i v. 


136 


86 


1472 


AAB95155 


Homo sapiens 


HELI- Human protein sequence SEQ

ID INU.l /lOO. 


136 


86 




AAP01745 


Homo sapiens


HUMA- Human gene 2 encoded 
secreted protein HOGCS52 variant, 
SEQ ID NO: 160. 


136 


86 


1473 


gi9294201 


Arabidopsis 
thaliana 


disease resistance protein 


70 


24 


1474 


AAE19157 


Homo sapiens 


THOR/ Human kinase polypeptide 
(PKIN-15).


631 


98 


1474 


AAM79131 


Homo sapiens 


HYSE- Human protein SEQ ID NO 


494 


72 
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1793, 






1474 


AAW19920 


Homo sapiens 


REGC Human Ksr' (kinase suppressor 
of Ras).


494 


72 


1475 


AAD12609_ 
aa1


Homo sapiens 


SAGA Human protein having 
hydrophobic domain encoding cDNA 
clone HP03974. 


657 


73 


1475 


AAO14199


Homo sapiens 


INCY- Human transporter and ion 
channel TRICH-16. 


657 


73 


1475 


AAE06614 


Homo sapiens 


SAGA Human protein having 
hydrophobic domain, HP03974. 


657 


73 


1476 


gil3905246 


Mus musculus


RIKEN cDNA 2410024K20 gene


71 


34 


1476 


gi|17505208| 
ref|NP_0816
29.1|


Mus musculus


CD2 antigen (cytoplasmic tail) binding
protein 2; 1500011B02Rik


71 


34 


1477 


gi806491 


Rattus 
norvegicus 


guanylyl cyclase


140 


65 


1477 


gi2648066 


Canis familiaris 


guanylate cyclase E 


118 


55 


1477 


gi2623074 


Bos taurus 


rod outer segment guanylate cyclase 
precursor 


116 


55 


1478 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


585 


73 


1478 


gi18676710


Homo sapiens 


FLJ00254 protein 


408 


69 


1478 


AAO04042


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 17934. 


392 


75 


1479 


AAU05396 


Homo sapiens 


GEHO Human titin (connectin) protein
sequence. 


208 


29 


1479


gil212992 


Homo sapiens 


Protein sequence and annotation 
available soon via Swiss-Prot; available 
at present via e-mail from 
LABEIT@EMBL-Heidelberg.DE


208 


29 


1479 


gil7066l05 


Homo sapiens 


Titin 


208 


29 


1480 


AAV44685_ 
aa1


Homo sapiens 


TEXA Osteoclast inhibitor protein,
OIP-1, coding sequence.


94 


41 


1480 


AAB35287 


Homo sapiens 


UROG- Human stem cell antigen-2.


94 


41 


1480 


AAY99709 


Homo sapiens 


REGC Human stem cell antigen-2, 
hSCA-2. 


94 


41 


1481 


AAB57094 


Homo sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO: 1672. 


122 


100 


1481


gi32672 


Homo sapiens 


interferon alpha/beta receptor 


122 


100 


1481 


AAQ49625_ 
aa1


Homo sapiens 


EUBI- Human interferon receptor 
extracellular domain coding sequence. 


118 


96 


1482 


AAD17516_ 
aa1


Homo sapiens 


SENO- Human taste receptor, hT1R1
cDNA coding sequence. 


890 


94 


1482 


ABB77319 


Homo sapiens 


INCY- Human G-protein coupled
receptor SEQ ID NO 3. 


890 


94 


1482 


AAE10372 


Homo sapiens 


SENO- Human taste receptor, hT1R1
protein. 


890 


94 




giJaj (oil/. 


Neurospora 
crassa 


related to SSD1 protein


1 AO 


1Q 

jy 


1483 


gi2645173 


Schizosaccharomyces
pombe


sts5+ 


99 


42 


1483 


gi2459997 


Candida albicans 


protein phosphatase Ssdl homolog 


99 


40 


1484 


gi|18569064| 
ref|XP_0953
78.1| 


Homo sapiens 


similar to 40S RIBOSOMAL 
PROTEIN S3A(V-FOS 
TRANSFORMATION EFFECTOR 


319 


96 
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PROTEIN) 






1484 


gi|20539276| 
ref|XP_0952
20.2| 


Homo sapiens 


similar to olfactory receptor MOR145- 
2 


259 


94 


1484 


gi|21295882|

gb|EAA080 

27.1| 


Anopheles 
gambiae str. 
PEST 


agCP1347 


68 


32 


1485 


ABB11761


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO:2131. 


197 


36 


1485 


gi930259 


Woolly monkey 
sarcoma virus 


reverse transcriptase (476 AA)


148 


33 


1485 


gil 8076262 


porcine 

endogenous 

retrovirus 


Pol protein 


147 


38 


1486 


AAM74887 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 35193. 


172 


100 


1486 


AAM62085 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 34190. 


172 


100 


1486 


gil52661 


Plasmid pSB24.2


neomycin resistance protein 


75 


26 


1487 


gil 2653493 


Homo sapiens 


Similar to brain acid-soluble protein 1 


75 


34 


1487 


gil7428832 


Ralstonia 
solanacearum 


PROBABLE AVRBS3-LIKE 
PROTEIN 


75 


33 


1487 


gi7329672 


Arabidopsis 
thaliana 


phosphatidate cytidylyltransferase-like 
protein 


72 


46 


1488 


AAU74754 


Homo sapiens 


INCY- Human protease PRTS-14 
protein sequence. 


2042 


83 


1488 


AAU74752 


Homo sapiens 


INCY- Human protease PRTS-12 
protein sequence. 


476 


39 


1488 


gil 1935 122 


Mus musculus 


papilin 


431 


40 


1489 


gi|17543712| 
ref|NP_4999
76.1| 


Caenorhabditis 
elegans 


Y55F3C.8.p 


72 


32 


1489 


gi|20344600| 
ref|XP_1095
79.1| 


Mus musculus 


RIKEN cDNA 4933431K05


70 


30 


1489 


gi|11692798|
gb|AAG400
02.1|AF320
125_1


Xenopus laevis 


ataxia telangiectasia and Rad3-related
protein 


69 


26 


1490 


AAB95817 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:18817.


256 


63 


1490 


ABB06369 


Homo sapiens 


BODE- Human neurogenesis related 
protein 12 SEQ ID NO:2.


173 


64 


1490 


AAB44394 


Homo sapiens 


HUMA- Gene 10 encoded human 
secreted protein fragment as BLASTX 
query sequence. 


83 


66 


1491 


gi438795 


Mus musculus 


serotonin 1A receptor 


73 


26 


1491 


gi!066326 


Mus musculus 


serotonin 1A receptor


72 


26 


1491


gi|438795|gb 
|AAA16850.
1|


Mus musculus 


serotonin 1A receptor


73 


26 


1492 


gi!6198083 


Drosophila 


LD29875p 


87 


33 
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CnArA 
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melanogaster 








1492 


gi2327063 


Pneumocystis 
carinii f. sp.
carinii 


protease 1 


75 


34 


1492 


gi20420 


Prunus dulcis


extensin 


75 


34 


1493 


AAG67087 


Homo sapiens 


SHAN- Human ATP-dependent serine 
protein hydrolase 13. 


1 f\A 


Or 


1493 


AAM76636 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 36942. 




oo 


1493 


AAM63822 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 35927. 


1 fYX 


oo 


1494 


AAY31225 


Homo sapiens 


A VET Human KIN A neucase pi^j 
protein. 


11 
(J 


J& 


1494 


gi3123906 


Homo sapiens 


pre-mRNA splicing factor 


73 


38 


1494 


gil3278975 


Homo sapiens 


pre-mRNA splicing factor similar to S. 
cerevisiae Prp16


73 


38 


1495 


gi|17568307| 
ref|NP_5098
37.1| 


Caenorhabditis 
elegans 


collagen 


74 


3d 


1496 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


410 


81 


1496 


gi|10834720| 
gb|AAG237 
90.1|AF258 
587_1


Homo sapiens 


PP565 


301 


77 


1496 


gi|6753924|r 
ef|NP_0343
74.1| 


Mus musculus 


Friend virus susceptibility 1 


127 


37 


1497 


gi20901968 


Caenorhabditis 
elegans 


C. elegans RPL-36 protein 
(corresponding sequence F37C12.4) 


71 


34 


1497 


gi|17554754| 
ref|NP_4985
73.1| 


Caenorhabditis 
elegans 


Ribosomal protein YL39 


71 


34 


1498 


gi5305335 


Mycobacterium 
tuberculosis 


proline-rich mucin homolog 


102 


2.1 


1498 


gi330130 


human 
herpesvirus 1 


latency associated transcript (LAT) 
ORF-2 


yf 




1498 


AAU83682 


Homo sapiens 


GETH Human PRO protein, Seq ID No 
182. 


CM 

94 


if\ 


1499 


AAY57937 


Homo sapiens 


INCY - Human transmembrane protein 

UTN/TDXT <1 

HlMrN-OJ. 


1 GO 


R1 


1499 


AAY36295 


Homo sapiens 


HUMA- Human secreted protein
encoded by gene 72. 


1 ^1 


100 


1499 


AAG75708 


Homo sapiens 


HUMA- Human colon cancer antigen
protein SEQ ID NO:6472. 


141 


y 


1500


cn? 14?871? 

~ I *r £. O 1 A£* 


Drosophila

melanogaster 


SD05267p 


165 


54 


1500 


gi20975274 


Homo sapiens 


skeletrophin 


114 


40 


1500 


gi!9773434 


Mus musculus 


skeletrophin 


99 


52 


1501 


ABB17830


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 6487. 


82 


37 


1501 


AAO12929


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 26821. 


73 


43 
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1502 


gi8778340 


Arabidopsis 
thaliana 


F1504.13 


77 




1503 


AAW03515 


Homo sapiens 


SHKJ Human DOCK180 protein. 


144 


33 


1503 


©1339910 


Homo sapiens 


DOCK180 protein


144 


jj 


1503 


©13195147 


Mus musculus 


HCH 


129 


25 


1505 


AAM70790 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 31096. 


77 


53 


1505 


AAM58316 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 30421.


77


53 


1505 


gi|21302711| 

gb|EAA148 

56.1| 


Anopheles 
gambiae str. 
PEST 


agCP4916 


77 


3U 


1506 


AAU75102 


Homo sapiens 


MYRI- Heat shock protein 8 (Hsp8). 


592 


79 


1506 


AAB82535 


Homo sapiens 


UYCO- Human heat shock protein 
Hsc70. 


592 


79 


1506 


AAE12987 


Homo sapiens 


SRIV/ Human Hsp70 family 
homologue, Hsc70. 


592 


79 


1507 


ABL53627_ 
aa1


Homo sapiens 


GENO- Breast protein-eukaryotic 
conserved gene 1 (BSTP-ECG1)
cDNA. 


213 


92 


1507 


ABB75677 


Homo sapiens 


GENO- Breast protein-eukaryotic 
conserved gene 1 (BSTP-ECG1)
protein. 


213 


92 


1507 


AAY99421 


Homo sapiens 


GETH Human PRO1433 (UNQ738)
amino acid sequence SEQ ID NO:292. 


213


92 


1508 


AAW15565 


Homo sapiens 


UYJO Human intracellular tyrosine 
kinase Tnk1-alpha.


79 


29 


1508 


fi i233062 


Gallus gallus


src downstream region 


78 


33 


1508 


gi!8376366 


Neurospora 
crassa 


related to ribosomal protein S15
precursor (mitochondrial) 


72 


30 


1509 


gi|21297482|

gb|EAA096

27.1| 


Anopheles 
gambiae str. 
PEST 


agCP15541


68 


36 


1510 


AAM41631 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6562. 


127 


37


1510 


AAM39845 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2990. 


127 


37 


1510 


AAM79502 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3148. 


127 


37 


1511 


gi21217669


Mus musculus 


myosin IIIA


70 


28 


1511 


gi|21302393| 

gb|EAA145 

38.1| 


Anopheles 
gambiae str. 
PEST 


agCP8799 


71 


36 


1511 


gi|20822589| 
ref|XP_1408
54.1|


Mus musculus 


similar to myosin IIIA


70 


28 


1512 


gi6911049


Babesia bovis 


p9.62-like variant erythrocyte surface 
antigen-1a


82 


28 


1512 


gi6911045


Babesia bovis 


p9.6.2 variant erythrocyte surface 
antigen-1a


82 


28 


1512 


gi6911047


Babesia bovis 


p8.4.1 variant erythrocyte surface
antigen-1a


81 


28 
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1513 


gil0174843 


Bacillus 
halodurans 


maltose transport system (permease) 


77 


25 


1513 


gi56312 


Rattus 
norvegicus 


Gephyrin 


76 


31 


1513 


gi4325371 


Arabidopsis 
thaliana 


contains similarity to Medicago 
truncatula N7 protein (GB:Y17613) 


74 


28 


1514 


AAY14196 


Homo sapiens 


TAKE/ T cell receptor zeta chain 
protein sequence. 


95 


100 


1514 


gi623042 


Homo sapiens 


T-cell receptor zeta chain 


95 


100 


1514 


gi4960202 


Sus scrofa 


CD3 zeta chain 


95 


100 


1515 


ABB07508 


Homo sapiens 


INCY- Human aminoacyl tRNA 
synthetase (ATRS) polypeptide (ID: 
7474756CD1). 


726 


100 


1515 


AAB43670 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO:1115.


604 


82 


1515 


g il464742 


Homo sapiens 


threonyl-tRNA synthetase 


604 


82 


1516 


gi21109348


Xanthomonas 
axonopodis pv. 
citri str. 306 


cytochrome B561 


77 


29 


1516 


gi21114046 


Xanthomonas 
campestris pv. 
campestris str. 
ATCC 33913 


cytochrome B561 


76 


28 


1516 


gi|21243760| 
ref|NP_6433 
42.1| 


Xanthomonas 
axonopodis pv. 
citri str. 306 


cytochrome B561 


77 


29 


1517 


ABB11450 


Homo sapiens 


HYSE- Human neurotoxin homologue, 
SEQ ID NO: 1820. 


119 


33 


1517 


gi8809770 


Mus musculus 


Ly-6I.l 


94 


30 


1517 


gi8809768 


Mus musculus 


lymphocyte antigen LY6I precursor 


94 


30 


1519 


gi|59977|em 
b|CAA7866 
2.1|


Human 

endogenous 

retrovirus 


tripartite fusion transcript PLA2L 


171 


67 


1519 


gi|17826947| 
dbj|BAB792
87.1| 


Pseudomonas sp. 
ND137 


beta-1,4-xylanase


73 


34 


1519 


gi|21232680| 
ref|NP_6385
97.1| 


Xanthomonas 
campestris pv. 
campestris str. 
ATCC 33913 


ribonuclease PH 


72 


30 


1520 


AAM78023 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 38329. 


190 


100 


1520 


AAM65326 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 37431. 


190 


100 


i on 


gllo44 /4oo 


Emericella
nidulans 


rHl/rriz protein homolog


121 


49 


1522 


AAG81417 


Homo sapiens 


ZYMO Human AFP protein sequence 
SEQ ID NO:352.


287 


100 


1523 


AAY90349 


Homo sapiens 


SMIK Human fatty acid synthase 
(FAS) protein sequence. 


158 


85 


1523 


AAB43871 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO: 1316. 


158 


85 
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1523 


gi915392 


Homo sapiens 


fatty acid synthase


158 


85 


1525 


AAG03819 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7900. 


93 


1 C\f\ 


1525 


Kil311466 


Homo sapiens 


24-kDa subunit of Complex I 


93 


100 


1525 


gil88852 


Homo sapiens 


NADH-ubiquinone reductase 


93 


100 


1526 


AAD02855_ 
aa1


Homo sapiens 


SUKA Human platelet membrane 
glycoprotein VI (GPVI) cDNA. 


73 


31 


1526 


AAB49403 


Homo sapiens 


MERE Human glycoprotein VI mature 
protein. 


73 


31


1526 


AAB61257 


Homo sapiens 


MILL- Mature human TANGO 268 
protein. 


73 


31 


1527 


gil7864896 


Mus musculus 


protocadherin 18 precursor 


81 


31 


1527 


gil5980222 


Yersinia pestis 


aconitate hydratase 1 


79 


30 


1527 


gi12248353


Fasciola hepatica 


NADH dehydrogenase subunit 5 


75 


56 


1528 


gi2440214 


Trypanosoma 
brucei brucei 


invariant surface glycoprotein 100 


83 


28 


1528 


gil0567463 


Rhizobium 
rhizogenes


probable virB1 gene


78 


22 


1529 


gi2231279 


Porcine 

reproductive and 
respiratory 
syndrome virus 


envelope protein 


66 


31 


1530 


gi|199851|gb 
|AAA39757.
1|


Mus musculus 


pol protein 


257 


42 


1530 


gi|1498648|g 

b|AAB0645

0.1|


Mus musculus 


Gag-Pol polyprotein 


257 


42 


1530 


gi|331995|gb 
|AAB03091.
1|


AKV murine 
leukemia virus 


gag-pol polyprotein (tag amber codon 
at 2250-2252 inserts Gln in Mo-MuLV)


257 


42 


1533 


gi435698 


Homo sapiens 


CD44SP 


136 


100 


1533 


AAV63461_ 
aa1


Homo sapiens 


GEHO Human CD44 antigen cDNA. 


130 


100 


1533 


AAT14724_ 
aa1


Homo sapiens 


GEHO Human haematopoietic CD44 
cDNA clone CD44.5. 


130 


100 


1534 


gi2622165 


Methanothermobacter
thermautotrophicus
str. Delta H


acetyltransferase 


71 


29 


1534 


gi|15679078| 
ref|NP_2761
95.1| 


Methanothermobacter
thermautotrophicus


acetyltransferase 


71 


29 


1535 


gi7777 


Drosophila 
melanogaster 


protein H 


73 


28 


1535 


gi457146 


Plasmodium 
yoelii 


rhoptry protein 


to 


•JO 


1535 


gil3195258 


Plasmodium 
yoelii yoelii 


235 kDa rhoptry protein 


73 


38 


1536 


ABB09740


Homo sapiens 


BODE- Amino acid sequence of human 
protein phosphatase 11.66.


132 


43 


1536 


gi|20830386|
ref|XP_1456


Mus musculus 


similar to importin alpha lb 


72 


35 
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SEQ 
ID 

NO: 


Accession 
No. 


Species 


Description 


Score 


%
Identity




42.1|










1537 


gil4039907 


Rattus 
norvegicus 


cytochrome P450 monooxygenase 
CYP2T1




jy 


1537 


R i2920650 


Mus musculus 


cytochrome P450 CYP2B19 


2. fj 


AA 


1537 


gi2353336 


Capra hircus 


cytochrome P450


271 


31 


1538 


AAU83175 


Homo sapiens 


ZYMO Novel secreted protein 
Z874015G4P. 


282 




1538 


gi6714803 


Streptomyces 
coelicolor A3(2) 


integral membrane protein. 


77 




1539 


gil2963397 


Prunus x 
yedoensis 


ribulose-1,5-bisphosphate
carboxylase/oxygenase large subunit 


74 


32 


1539 


gi466436 


Saccharomyces 
cerevisiae 


BOI1 


69 


31 


1539 


gi5833897 


Besleria affinis 


ribulose 1,5-bisphosphate carboxylase 
large subunit 


69 


31 


1542 


AAY32193 


Homo sapiens 


INCY- Human receptor molecule 
(REC) encoded by Incyte clone 
044150. 


73 


26 


1542 


gi7576677 


Helicobacter 
pylori 


IceA1


72 


44 


1542 


gi|20841498| 
ref|XP_1315
41.1| 


Mus musculus 


similar to MUF1 protein 


73 


26 


1546 


gi14581448


Homo sapiens 


FSHD Region Gene 2 protein 


73 


42 


1546 


gil 5982852 


Arabidopsis 
thaliana 


AT5g66850/MUD21_11


71 


34 


1546 


gi|14581448| 
gb|AAK219 
77.1| 


Homo sapiens 


FSHD Region Gene 2 protein 


73 


42 


1547 


gil8676660 


Homo sapiens 


FLJ00229 protein


192 


92 


1547 


AAU21409 


Homo sapiens 


HUMA- Human novel foetal antigen, 
SEQ ID NO 1653.


179 


100 


1547 


AAM42128


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 7059. 


114 


53 


1548 


AAG64494 


Homo sapiens 


SHAN- Human natriuretic peptide 
receptor 18. 


539 


100 


1548 


gil 86767 10 


Homo sapiens 


FLJ00254 protein


268 


77 


1548 


AAB28764 


Homo sapiens 


HUMA- Sequence homologous to 
protein fragment encoded by gene 21.


249 


72 


1549 


AAB67055 


Homo sapiens 


INCY- Human immune response 
molecule (IMUN) protein SEQ ID NO: 
9. 


606 


82 


1549 


AAO01862 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 15754. 


404 


72 


1549 


gi|6753924|r 
ef|NP_0343
74.1|


Mus musculus 


Friend virus susceptibility 1 


213 


36 


1550 


gil90l29 


Homo sapiens 


70kDa peroxisomal membrane protein 


92 


100 


1550 


gi825711 


Homo sapiens 


70kD peroxisomal integral membrane 
protein 


92 


100 


1550 


gi220862 


Rattus 
norvegicus 


PMP70 


89 


94 


1551 


AAM69543 


Homo sapiens 


MOLE- Human bone marrow
expressed probe encoded protein SEQ
ID NO: 29849.


228


100






1551 


AAM57148 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 29253. 


228 


100 


1551 


AAB93944 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 13960. 


94 


57 


1552 


gi4884924 


Rangiferine 
herpesvirus 1 


glycoprotein C 


75 


34 


1552 


gi|18556240| 
ref|XP_0676
28.2| 


Homo sapiens 


similar to Salivary glue protein SGS-3 
precursor 


78 


30 


1552 


gi|4884924|g
b|AAD3187
6.1|


Rangiferine 
herpesvirus 1 


glycoprotein C 


75 


34 


1553 


gi|2193870|d 
bj|BAA2041 
9.1| 


Mus musculus 


reverse transcriptase 


176 


35 


1553 


gi|2731767|g
b|AAC5354
2.1|


Mus musculus 


endonuclease/reverse transcriptase 


176 


35 


1554 


ABB08776 


Homo sapiens 


BODE- Human neuregulin 55 SEQ ID 
NO 2. 


75 


29 


1554 


AAM92816 


Homo sapiens 


HUMA- Human digestive system 
antigen SEQ ID NO: 2165. 


71 


29 


1554 


gi|6322838|r 
ef|NP_0129
11.1|


Saccharomyces 
cerevisiae 


Protein required for cell viability; 
Ykl014cp 


70 


27 


1555 


gi7528184 


Drosophila 
melanogaster 


bicoid-interacting protein BIN3


78 


28 


1555 


gil5292595 


Drosophila 
melanogaster 


SD09926p 


78 


28 


1555 


gi4514620


Mus musculus 


Ror2 


71 


24 


1557 


ABA91504_ 
aal 


Homo sapiens 


EYEE- Human epidermal growth factor 
receptor precursor cDNA. 


144 


93 


1557 


AAF85332_ 
aal 


Homo sapiens 


NOVS Nucleotide sequence of wild 
type EGFR1. 


144 


93 


1557 


AAM50768 


Homo sapiens 


EYEE- Human epidermal growth factor 
receptor precursor. 


144 


93 


1558 


AAB99950 


Homo sapiens 


SHAN- Human alkylated-DNA-protein 
cysteine methyltransferase 14. 


221 


100 


1558 


AAU16267 


Homo sapiens 


HUMA- Human novel secreted protein, 
Seq ID 1220. 


221 


100 


1558 


ABB11507 


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO: 1877. 


183 


97 


1559 


Eil4599730 


Spachea correae 


maturase 


71 


28 


1559 


gil4599648 


Blepharandra 
heteropetala 


maturase 


71 


30 


1559 


gil4599673 


Galphimia 
gracilis 


maturase 


70 


28 


1560 


gi2323287 


multiple 
sclerosis 
associated 
retrovirus 


polyprotein 


340 


83 


1560 


gi|13310191|
gb|AAK181
89.1|AF331
500_1


multiple
sclerosis
associated
retrovirus
element


recombinant envelope protein


260


70








1560 


gi|21103962|
gb|AAM331 
41.1| 


Homo sapiens 


enverin-2 


248 


84 


1561 


AAB94698 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 15680. 


107 


95 


1561 


AAU18480 


Homo sapiens 


HUMA- Human endocrine polypeptide 
SEQ ID No 435. 


107 


95 


1561 


ABB10288 


Homo sapiens 


HUMA- Human cDNA SEQ ID NO: 
596. 


107 


95 


1562 


gi969078 


Drosophila 
melanogaster 


S-adenosylhomocysteine hydrolase 


73 


26 


1562 


gi21064553 


Drosophila 
melanogaster 


RE58316p 


73 


26 


1562 


AAM41205 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 6136. 


72 


30 


1563 


gi1778844


Dictyostelium 
discoideum 


LimA 


71 


34 


1563 


gi|20985456| 
ref|XP_1421
11.1|


Mus musculus


similar to actin beta chain - human 


75 


36 


1563 


gi|1778844|g
b|AAB4092
9.1|


Dictyostelium 
discoideum 


LimA 


71 


34 


1564 


gi|9507757|r
ef|NP_0614
23.1|


Plasmid F


resolvase 


507 


91 


1564 


gi|148589|gb
|AAA24900.
1|


Plasmid F


Protein D 


507 


91 


1564 


gi|10955295| 
ref|NP_0526
36.1| 


Escherichia coli 


resolvase 


501 


90 


1565 


gi7649370 


Arabidopsis 
thaliana 


guanine nucleotide-exchange-like
protein 


77 


38 


1565 


gil674160 


Mycoplasma 
pneumoniae 


involved in cytadherence, see: 
MPN142 


71 


35 


1565 


gi|15229258| 
ref|NP_1899
16.1| 


Arabidopsis 
thaliana 


guanine nucleotide-exchange-like
protein 


77 


38 


1566 


gil799600 


SwissProt 
Accession 
Number P31458


similar to 


1051 


99 


1566 


gil3814506 


Sulfolobus 
solfataricus 


Mandelate racemase /muconate 
lactonizing enzyme related protein 
(MR/MLE) 


2oo 


JO 


1566 


gil0640034 


Thermoplasma 
acidophilum 


starvation-sensing protein rspA related 
protein 


270 


35 


1567 


gil3359972 


Escherichia coli 
O157:H7


acridine efflux pump 


573 


98 


1567 


gil773144 


Escherichia coli 


probable transmembrane protein AcrE 


573 


98 


1567 


gi532311


Escherichia coli 


114 kDa protein 


573 


98 


1569 


gi8918871 


YccA of plasmid 
ColIb-P9] 
[Plasmid F 


96 pct identical to gp:AB021078_30


288 


98 


1569 


gi|17136976| 
ref|NP_4770
26.1|


Drosophila 
melanogaster 


repo-Pl; Antibody RK2 


71 


33 


1569 


gi|6502544|g 
b|AAF14351
.1|AF11019
8_1


Glomus 
intraradices 


homeobox protein HB 1 


70 


31 


1570 


gil3363792 


Escherichia coli 
O157:H7


zinc-transporting ATPase 


410 


87 


1570 


gi466605 


Escherichia coli


No definition line found 


410 


87 


1570 


gil2518128 


Escherichia coli
O157:H7
EDL933


zinc-transporting ATPase 


410 


87 


1571 


AAU83186 


Homo sapiens 


ZYMO Novel secreted protein 
Z887014G7P. 


1006 


100 


1571 


gi7248459 


Zea mays 


arabinogalactan protein 


85 


29 


1571 


gi3513742


Arabidopsis 
thaliana 


contains similarity to Zea mays 
embryogenesis transmembrane protein 
(GB:X97570) 


82 


35 


1572 


gil2597465 


Caenorhabditis 
elegans 


CED-1 


72 


44 


1572 


gil9571666 


Caenorhabditis 
elegans 


similar to EGF-like domain 


72 


44 


1572 


gi4883938 


Drosophila 
melanogaster 


laminin alpha 1,2 


67 


31 


1573 


ABB12490 


Homo sapiens 


HYSE- Human bone marrow expressed 
protein SEQ ID NO: 329. 


106 


38 


1574 


fi il478205 


Mus musculus 


PNG protein 


75 


41 


1574 


AAM40148 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 3293. 


69 


56 


1574 


AAM79341 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
2987. 


69 


35 


1576 


gi|20882651| 
ref|XP_1233
03.1| 


Mus musculus 


ATPase, class 2, member b 


234 


91 


1576 


gi|7656918|r
ef|NP_0566
20.1| 


Mus musculus 


ATPase, class 2, member b; ATPase 
9B, class II; ATPase 9B, p type 


234 


91 


1577 


gil8143418 


Alteromonas sp. 
O-7


chitinase A 


77 


39 


1577 


gil5426l05 


Leishmania 
major 


probable surface antigen protein 


75 


24 


1578 


gil 9702241 


Homo sapiens 


rabconnectin 




if j 


1578 


gi7452946 


Homo sapiens 


X-like 1 protein 


132 


41 


1578 


gil279384 


Drosophila 
melanogaster 


X 


109 


29 


1580 


AAE20337 


Homo sapiens 


HUMA- Human B7-H1 1 protein 
mature extracellular domain. 


122 


23 


1580 


AAE20336 


Homo sapiens 


HUMA- Human B7-H1 1 protein 
extracellular domain. 


122 


23 


1580 


gi2062702 


Homo sapiens 


butyrophilin 


122 


23 


1581 


AAE18640 


Homo sapiens 


INCY- Human G-protein coupled 
receptor (GCREC-1). 


70 


35 


1581 


gil8369751 


Oryza sativa 


ethylene responsive protein 


70 


50 


1581 


gil5217292 


Oryza sativa] 
[Oryza sativa 
(japonica 
cultivar-group) 


Putative AP2 domain containing 
protein 


70 


50 


1583 


gi6468047 


Homo sapiens 


Kruppel-like factor 


85 


73 


1583 


gi5916096 


Homo sapiens 


Kruppel-like factor LKLF 


85 


73 


1583 


gi4583418 


Homo sapiens 


Kruppel-like zinc finger transcription 
factor 


85 


73 


1585 


gi2570021 


Homo sapiens 


paired box containing transcription 
factor 


77 


37


1585 


gi3115988


Homo sapiens 


dJ394P2-l.l (PAX-7) 


77 


37 


1585 


gi2570015 


Homo sapiens 


alternative 


77 


37 


1586 


gi7861533 


Rattus 
norvegicus 


retina specific protein PAL 


72 


43 


1586 


gi20977028 


Xenopus laevis 


mitotic phosphoprotein 39 


72 


34 


1586 


AAB58458 


Homo sapiens 


ROSE/ Lung cancer associated 
polypeptide sequence SEQ ID 796. 


68 


39 


1587 


gi5901864 


Drosophila 
melanogaster 


BcDNA.LD27873 


81 


24 


1587 


gil 54585 14 


Streptococcus 
pneumoniae R6 


Pneumococcal histidine triad protein D 
precursor 


78 


27 


1587 


gi5042400 


Homo sapiens 


NFI-X3=transcription factor [AA 


75 


30 


1592 


gi4210501 


Homo sapiens 


BC85722_1


253 


61 


1592 


gil4794910 


Homo sapiens 


capicua protein 


253 


61 


1592 


gil4794914 


Mus musculus 


capicua protein 


253 


61 


1593 


gi|8131854|g 
b|AAF73108 
.1|AF14795
6_1


Trypanosoma 
cruzi 


antigen JL8 


69 


34 


1595 


gil 8892729 


Pyrococcus 
furiosus DSM 
3638 


3-hydroxyisobutyrate dehydrogenase 


70 


27 


1595 


gi|20847046| 
ref|XP_1366
21.1| 


Mus musculus 


similar to Transcription factor BTF3 
(RNA polymerase B transcription 
factor 3) 


70 


28 


1595 


gi|18977088| 
ref|NP_5784
45.1| 


Pyrococcus 
furiosus DSM 
3638 


3-hydroxyisobutyrate dehydrogenase 


70 


27 


1597 


AAU83621 


Homo sapiens 


GETH Human PRO protein, Seq ID No 
60. 


151 


42 


1597 


AAO05826


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 19718. 


146 


83 


1597 


AAM41346 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6277. 


102 


46 


1598 


AAM79503 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3149. 


80 


35 


1598 


AAM78519 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1181. 


80 


35 


1598 


gi18676526


Homo sapiens 


FLJ00160 protein 


80 


35 


1599 


gi2149640 


Arabidopsis
thaliana


Argonaute protein


72


33








1599 


gil5027491 


respiratory 
syncytial virus


glycoprotein 


71 


32 


1599 


gi|15221177| 
ref|NP_1752
74.1| 


Arabidopsis 
thaliana


leaf development protein Argonaute 


72 


33 


1601 


gil7130010 


Nostoc sp. PCC 
7120 


WD-40 repeat protein 


136 


Zo 


1601 


gil653631 


Synechocystis 
sp. PCC 6803 


beta transducin-like protein 


131 


lt> 


1601 


gil7135261 


Nostoc sp. PCC 
7120 


WD-40 repeat protein


115 


27 


1602 


gil 103853 


Rattus 
norvegicus 


rHAPl-A 


89 


33 


1602 


gill03851 


Rattus 
norvegicus 


huntingtin associated protein 


89 


33 


1602 


gil4579673 


Takifugu
rubripes 


pericentriolar material 1 protein 


87 


30 


1603 


gi537446 


Arabidopsis
thaliana


AtHSP101


75 


31 


1603 


gil2324908 


Arabidopsis 
thaliana


heat shock protein 101; 13093-16240 


75 


31 


1603 


gi6715468 


Arabidopsis 
thaliana 


heat shock protein 101 


75 


31 


1604 


gi2190531


Vibrio cholerae 


methyl accepting chemotaxis protein 


71 


26 


1604 


gi9657614


Vibrio cholerae 


hemolysin secretion protein HylB 


71 


26


1604 


gi9655306 


Vibrio cholerae 


heat shock protein GrpE


70 


35 


1605 


gi3912936 


Geobacillus
stearothermophilus


ornithine carbamoyltransferase 


68 


31 


1606 


gi8797 


Drosophila 
melanogaster 


CYS3HIS finger protein 


678 


51 


1606 


gil5291975 


Drosophila 
melanogaster 


LD33756p 


617 


65 


1606 


gi6967181 


Homo sapiens 


c399E4.1 (similar to D.melanogaster 
unkempt protein.) 


549 


75 


1607 


gi|21301783|
gb|EAA139
28.1|


Anopheles 
gambiae str. 
PEST 


agCP8730 


72 


35 


1607 


gi|21361276| 
ref|NP_0060
75.2| 


Homo sapiens 


interferon-stimulated transcription
factor 3, gamma (48kD); interferon-
stimulated gene factor 3, gamma
subunit (48 kD) 


68 


29 


1609 


gi2661094 


Spinacia 
oleracea 


cold acclimation protein 


76 


32 


1612 


gi|1780975|e 
mb|CAA714 
18.1| 


Human 
endogenous 
retrovirus K 


gag protein 


312 


34 


1612 


gi|5802810|g
b|AAD5179
1.1|


Homo sapiens 


Gag-Pro-Pol protein 


309 


34 


1612 


gi|887448|e
mb|CAA513
06.1|


Human
endogenous
retrovirus


gag 


309 


34 


1613 


AAO13889


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 27781. 


73 


42 


1614 


gill 065727 


Homo sapiens 


dJ493F7.1 (similar to murine BET3) 


347 


100 


1614 


gi2791806 


Mus musculus 


bet3 


253 


69 


1614 


gil3277654 


Mus musculus 


Bet3 homolog (S. cerevisiae) 


253 


69 


1615 


gil 122901 


Saccharomyces 
cerevisiae 


MSP 8 


77 


20 


1615 


gi825546 


Saccharomyces 
cerevisiae 


Cat8p 


77 


20 


1615 


gil7978563 


Xenopus laevis 


Sp1-like zinc-finger protein XSPR-1


75 


40 


1616 


AAY02536 


Homo sapiens 


ICOS- Human ICAM-6 protein 
sequence. 


458 


98 


1616


gil2248907 


Homo sapiens 


TCAM-1 


458 


98 


1616 


gi4579740 


Rattus 
norvegicus 


testicular cell adhesion molecule 1 
(TCAM1)


366 


76 


1617 


AAM67067 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27373. 


271 


64 


1617 


AAM54664 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26769. 


271 


64 


1617 


AAM56747 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 28852. 


229 


69 


1618 


gi5802814 


Homo sapiens 


Gag-Pro-Pol-Env protein 


532 


52 


1618 


gil780973 


Human 
endogenous 
retrovirus K 


pol protein 


531 


52 


1618 


gi5802821 


Homo sapiens 


Gag-Pro-Pol protein 


531 


52 


1619 


gi2769587 


Mus musculus 


STOP protein 


662 


86 


1619 


gil370291 


Rattus 
norvegicus 


STOP protein 


662 


92 


1619 


gi3287265 


Rattus 
norvegicus 


E-STOP protein 


662 


92 


1620 


AAM65980 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26286. 


266 


100 


1620 


AAM53601 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 25706. 


266 


100 


1620 


gi|20270271| 
ref|NP_6200
82.1| 


Mus musculus 


RIKEN cDNA 1190017O12


198 


80 


1621 


gil 1862941 


Mus musculus 


DDM36E 


74 


33 


1621 


gil 1862939 


Mus musculus 


DDM36 


74 


33 


1621 


gi7650186 


Mus musculus 


neighbor of Punc e11 protein


73 


33 


1622 


gi3157464


Thermus sp. A4 


integral membrane protein 


74 


TO 
JO 


1623 


gi|59977|em 
b|CAA7866 
2.1|


Human
endogenous
retrovirus


tripartite fusion transcript PLA2L 


129 


82 


1623 


gi|20161147| 
dbj|BAB900 
75.1| 


Oryza sativa
(japonica
cultivar-group)


VsaA-like protein


88 


32 


1623 


gi|17864474|
ref|NP_5248
33.1|


Drosophila
melanogaster


domino


87


41








1626 


AAO00498 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 14390. 


99 


43 


1627 


gil4041733 


Xenorhabdus 
nematophila 


XptA2 protein 


70 


23 


1627 


gi|15641593| 
ref|NP_2312
25.1| 


Vibrio cholerae 


catalase 


69 


23 


1628 


gil9888204 


Methanopyrus 
kandleri AV19


Site-specific DNA methylase 


80 


27 


1628 


gi6358691 


Simian
immunodeficiency
virus


Pol protein 


78 


32


1628 


gi|20094956| 
ref|NP_6148
03.1| 


Methanopyrus 
kandleri AV19 


Site-specific DNA methylase 


80 


27 


1629


AAB07704 


Homo sapiens 


INMR Protein encoded by the 
endogenetic fragment of HERV-W. 


594 


67 


1629 


gi8272464 


Homo sapiens 


gag 


594 


67 


1629 


AAB07703 


Homo sapiens 


INMR Protein encoded by the 
endogenetic fragment of HERV-W. 


590 


66 


1630 


gi32498 


Homo sapiens 


precursor (AA -23 to 476) 


145 


100 


1630 


gi339595 


Homo sapiens 


triglyceride lipase precursor 


145 


100 


1630 


gi386859 


Homo sapiens 


hepatic lipase 


145 


100 


1631 


gi8777465 


Rattus 
norvegicus 


cytoplasmic dynein heavy chain 


703 


77 


1631 


gil7019507 


Tripneustes 
gratilla 


dynein heavy chain isotype IB 


505 


53 


1631 


AAB93815 


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO: 13606.


457 


71 


1632 


AAM68837 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 29143. 


122 


48 


1632 


AAM56460 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 28565. 


122 


48 


1632 


gil7861826 


Drosophila
melanogaster 


GM01964p 


90 


51 


1633 


gi|21300783|
gb|EAA129
28.1|


Anopheles 
gambiae str. 
PEST 


ebiP1105 


77 


33 


1633 


gi|19880523| 
gb|AAM003 
72.1|AF368
053_1


Bactrocera 
dorsalis 


vitellogenin 1 precursor 


68 


27 


1633 


gi|21070999|
ref|NP_0659
11.1| 


Homo sapiens 


stromal interaction molecule 2 
precursor 


68 


39 


1637 


gi2323287 


multiple 
sclerosis 
associated 
retrovirus 


polyprotein 


289 


91 


1637 


gi|21103962|
gb|AAM331
41.1|


Homo sapiens


enverin-2


261


82










1637 


gi|13310191| 
gb|AAK181 
89.1|AF331 
500_1 


multiple
sclerosis
associated
retrovirus
element


recombinant envelope protein 


259 


82 


1638 


AAR58809 


Homo sapiens 


UYNY Human RPTP-gamma. 


86 


26 


1638 


gi292411 


Homo sapiens 


receptor-type protein tyrosine 
phosphatase gamma 


86 


26 


1638 


gil263069 


Homo sapiens 


receptor tyrosine phosphatase gamma 


86 


26 


1639 


gi9857054 


Leishmania 
major 


possible CG7055 protein 


74 


27 


1639 


gi|20853034| 
ref|XP_1259
62.1| 


Mus musculus 


expressed sequence AI447519 


73 


35 


1639 


gi|7008003|d 
bj|BAA9087
4.1|


Mus musculus 


transcription factor MAZR 


73 


35 


1640 


AAG03810 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7891. 


220 


95 


1640 


gil 86800 


Homo sapiens 


ribosomal protein L12


220 


95 


1640 


gi57680 


Rattus rattus 


ribosomal protein L12 


220 


95 


1641 


AAB44286 


Homo sapiens 


GETH Human PRO1072 (UNQ529) 
protein sequence SEQ ID NO:303. 


1709 


100 


1641 


AAY41730 


Homo sapiens 


GETH Human PRO1072 protein 
sequence. 


1709 


100 


1641 


gil4602625 


Homo sapiens 


PAN2 protein 


1709 


100 


1642 


gi20147241 


Arabidopsis 
thaliana 


AT5g09850/MYH9_5


74 


32 


1642 


gil4329782 


Homo sapiens 


dJ1121G12.3 (Novel gene) 


72 


28 


1642 


gi|16648730|
gb|AAL255
57.1|


Arabidopsis 
thaliana 


AT5g09850/MYH9_5


74 


32 


1643 


gi2952340 


Rattus 
norvegicus 


insulin receptor substrate 2 


89 


31 


1643 


gi2653351 


Bovine
herpesvirus type
1.1


product of latency-related gene 


83 


30 


1643 


gi4511969 


Homo sapiens 


insulin receptor substrate-2 


82 


26 


1644 


gi9964099 


Chlamydia 
trachomatis 


inclusion membrane protein 


73 


35 


1644 


gil9171028 


Encephalitozoon 
cuniculi 


ATP DEPENDENT DNA BINDING 
HELICASE (RAD3/XPD 
SUBFAMILY OF HELICASES) 


67 


29 


1644 


gi|9964095|g 
b|AAG0982 
1.1|AF2793 
62_1


Chlamydia 
trachomatis 


inclusion membrane protein 


73 


35 


1646 


gi|10863995| 
ref|NP_0670
11.1| 


Homo sapiens 


clones 23667 and 23775 zinc finger 
protein 


67 


42 


1647 


gil 196425 


Homo sapiens 


envelope protein 


93 


39 


1647 


gi200296 


Mus musculus 


perlecan 


85 


26 


1647 


gi8131894


Homo sapiens 


mitofilin 


QA 




1648 


gil573040 


Haemophilus 
influenzae Rd 


aspartokinase I / homoserine 
dehydrogenase I (thrA)


73 


36 


1648 


gi8778726 


Arabidopsis
thaliana


T25N20.14 


73 


31


1648 


gi|16272063| 
ref|NP_4382
62.1| 


Haemophilus 
influenzae Rd 


aspartokinase I / homoserine 
dehydrogenase I (thrA) 


73 


36 


1649 


gi295642 


Saccharomyces 
cerevisiae 


phospholipase C 


79 


36 


1649 


gi7548846 


Saccharomyces 
cerevisiae 


delta class phosphoinositide-specific
phospholipase C homolog 


77 


iO 


1649 


gil61104 


Schistosoma 
mansoni 


engrailed-like homeodomain protein 


-7 A 

74 




1651 


gi|13129464| 
gb|AAK131 
22.1|AC080 
019_14


Oryza sativa] 
[Oryza sativa 
(japonica 
cultivar-group) 


Polyprotein 


en 
00 


AC\ 

4U 


1652 


AAG81446 


Homo sapiens 


ZYMO Human AFP protein sequence 
SEQ ID NO:410. 




1 AA 


1652 


gil8032212 


Homo sapiens 


histone acetyltransferase MOZ2 


89 


34 


1652 


AAR34936 


Homo sapiens 


UYJO CENP-B. 


77 


35 


1653 


gi20145484


Bos taurus 


SCO-spondin 


71 


29 


1655 


AAM86382 


Homo sapiens 


HUMA- Human
immune/haematopoietic antigen SEQ
ID NO: 13975.


129 


55 


1655 


ABB03887 


Homo sapiens 


HUMA- Human musculoskeletal 
system related polypeptide SEQ ID NO 
1834. 


118 


62 


1655 


AAM75964 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 36270. 


85 


56 


1659 


gi38035 


Homo sapiens 


p25 protein 


110 


45 


1659 


gi330915 


Equine 
herpesvirus 1 


IR4 protein 


99 


28 


1659 


gil56606 


Chironomus 
tentans 


Spld 


84 


30 


1660 


gi9654641 


Vibrio cholerae 


3-deoxy-D-manno-octulosonic-acid 
transferase 


84 


23 


1660 


gi|20835446| 
ref|XP_1444
09.1|


Mus musculus 


similar to STARP antigen 


15 


ZD 


1660 


gi|15596880| 
ref|NP_2503 
74.1| 


Pseudomonas 
aeruginosa 


probable sugar aldolase 


72 


26 


1661 


gi4062318 


Escherichia coli 


Heat-responsive regulatory protein 


79 


36 


1661


giy / OUZD 


Escherichia coli


HlbrV 


79 


36 


1661 


gil786951 


Escherichia coli 
K12 


protein modification enzyme, induction 
of ompC


79 


36 


1662 


AAM68588 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 28894. 


155 


100 


1662 


AAM56212 


Homo sapiens 


MOLE- Human brain expressed single
exon probe encoded protein SEQ ID
NO: 28317.


155


100






1662 


gi3845169 


Plasmodium 
falciparum 3D7 


phosphatase (acid phosphatase family)


66 


52 


1663 


AAG89215 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 335. 


218 


100


1663 


gi20070921 


Mus musculus 


RIKEN cDNA 2410008M22 gene 


130 


55 


1663 


AAR77602 


Homo sapiens 


FORS/ Human circulating cytokine 
CC-1 C-terminal fragment.


92 


44 


1664 


AAE18212 


Homo sapiens 


CURA- Human MOM protein. 


75 


47 


1664 


AAM00966 


Homo sapiens 


HYSE- Human bone marrow protein, 
SEQ ID NO: 442. 


72 


35 


1665 


AAB92828 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 11365. 


74 


93 


1665 


AAG63852 


Homo sapiens 


INCY- Amino acid sequence of human 
GTPase activating protein GTPAP2. 


74 


93 


1665 


AAG63851 


Homo sapiens 


INCY- Amino acid sequence of human 
GTPase activating protein GTPAP1.


74 


93 


1666 


AAM72897 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 33203. 


135 


65 


1666 


AAM60268 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 32373. 


135 


65 


1666 


gi4007097 


Homo sapiens 


dJ1118D24.2 (60S Ribosomal Protein
L10 LIKE) 


135 


65 


1667 


gi212267 


Gallus gallus 


cartilage link protein 


917 


49 


1667 


gi2010 


Sus scrofa 


link protein precursor (AA -15 to 339) 


913 


51 


1667 


gi459439 


Equus caballus 


link protein 


910


51 


1668 


gil0443237 


Mus musculus 


splicing factor 3a, subunit 2 


276 


36 


1668 


gi396743 


Podocoryne 
carnea 


Pod-EPPT 


276 


30 


1668 


gi294131 


Plasmodium 
falciparum 


circumsporozoite protein 


266 


22 


1669 


AAM49641 


Homo sapiens 


BOEH Human tumour-associated 
antigen B345 protein SEQ ID NO 4.


132 


65 


1669 


AAU12252 


Homo sapiens 


GETH Human PRO5773 polypeptide
sequence. 


132 


65 


1669 


AAY91592 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 6 SEQ ID 
NO:265. 


132 


65 


1670 


gi4835383 


Homo sapiens 


alias DLC1 


226 


47 


1670 


gi4704343 


Homo sapiens 


alias DLC1; candidate tumor 
suppressor gene 


226 


47 


1670 


gil55627 


Acanthamoeba 
castellanii 


myosin I heavy chain 


118 


42 


1671 


ABB12490


Homo sapiens 


HYSE- Human bone marrow expressed
protein SEQ ID NO: 329.


237 


88 


1671 


gi6002932 


Streptomyces
fradiae 


glycosyl transferase 


67 


35 


1671 


gi|9634613|r 
ef|NP_0381
50.1| 


Human
papillomavirus
type 69


L1


65 


39 


1672 


gil3938013 


Homo sapiens 


Similar to RIKEN cDNA 2610509G12 
gene 


333 


66 


1672


gi2388970 


Schizosaccharomyces
pombe


tat-binding homolog 7, AAA ATPase 
family protein 


235 


41 


1672 


gi6850321 


Arabidopsis
thaliana


Contains similarity to YTA7 ATPase 
gene from Saccharomyces cerevisiae 
gb|X81072, and contains Bromodomain 
PF|00439, AAA PF|00004, and Sigma- 
54 PF|00158 transcription factor 
domains. 


214 


40 


1673


gil 10661 13 


Drosophila 
melanogaster


Misexpression suppressor of ras 4 


71 


29 


1673 


gi|20829387| 
ref|XP_1295
40.1| 


Mus musculus 


RIKEN cDNA 4930455F23 


77 


27 


1673 


gi|17647635| 
ref|NP_5237
75.1| 


Drosophila 
melanogaster 


Misexpression suppressor of ras 4 


71 


29 


1674 


gi|20535935| 
ref|XP_1157
87.1| 


Homo sapiens 


similar to splicing coactivator subunit 
SRm300; RNA binding protein; AT- 
rich element binding factor 


75 


37 


1674 


gi|17544226| 
ref|NP_5001
51.1|


Caenorhabditis 
elegans 


Y76B12C.4.P 


72 


34 


1674 


gi|17559826| 
ref|NP_5057
99.1| 


Caenorhabditis 
elegans 


sepB domain 


70 


26 


1675 


gi5708067 


Oryctolagus 
cuniculus 


hyperpolarization activated cation 
channel 


99 


27 


1675 


gi402558 


Canis familiaris 


mucin 


98 


27 


1675 


gil0636484 


Homo sapiens 


polyglutamine-containing protein


96 


26 


1676 


AAM95365 


Homo sapiens 


HUMA- Human reproductive system 
related antigen SEQ ID NO: 4023. 


73 


26 


1676 


AAB56709 


Homo sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO: 1287. 


72 


34 


1676 


gil881288 


Bacillus subtilis 


FUNCTION UNKNOWN, SIMILAR 
PRODUCT IN E.COLI, H. 
INFLUENZAE AND NEISSERIA 
MENINGITIDIS. 


71 


30 


1677 


gi|15892512| 
ref|NP_3602
26.1| 


EC:2.7.7.41] 

[Rickettsia 

conorii 


phosphatidate cytidylyltransferase


65 


34 


1679 


gil4231 


Saccharomyces 
cerevisiae 


NADH dehydrogenase (ubiquinone) 


ID 




1679 


gi805022 


Saccharomyces 
cerevisiae 


Ndi1p


73 


31 


1679 


gil353352 


Chlamydomonas 
reinhardtii


alanine aminotransferase 


70 


27 


1680




Bacillus subtilis


surfactin production 


77 


36 


1680 


gi396482 


Bacillus subtilis 


srfA2 


77 


36 


1680 


gi516360


Bacillus subtilis 


surfactin synthetase 


77 


36 


1681 


AAG64494 


Homo sapiens 


SHAN- Human natriuretic peptide 
receptor 18.


156 


80 


1681 


AAE16275 


Homo sapiens 


INCY- Human kinase PKIN-21 
protein. 


154 


73 


1681 


AAM40599 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 5530.


154


73






1682 


gi2323287 


multiple 
sclerosis 
associated 
retrovirus 


polyprotein 


1646 


75 


1682 


gi|2351212|d 
bj|BAA2206 
4.1| 


Friend murine 
leukemia virus 


gag-pol polyprotein (precursor protein) 


807 


40 


1682 


gi|9626961|r
ef|NP_0579
33.1| 


Murine leukemia 
virus 


Pr180


802 


40 


1683 


AAM39205 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2350. 


457 


53 


1683 


gi3033415 


Gibbon ape 
leukemia virus 


gag polyprotein 


353 


38 


1683 


gi|6524623|g 
b|AAF15097 
.1|


Phascolarctos 
cinereus 


gag protein 


343 


38 


1684 


gil9110438 


Homo sapiens 


polycystin-1L1


712 


98 


1684 


gi6361629 


Periplaneta 
americana 


vitellogenin 


81 


25 


1684 


gi3115393


Rana pipiens 


guanylate cyclase inhibitory protein 


80 


35 


1686 


AAY91542 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 92 SEQ ID 
NO:215. 


212 


84 


1686 


gil279841 


Bos taurus 


glycine transporter 


72 


36 


1686 


Ril9879917 


Oryza sativa 


acid phosphatase 


70 


35 


1687 


gi12056568


Homo sapiens 


MSTP063 


212 


88 


1687 


gil3539684 


Homo sapiens 


zinc finger protein 291


212 


88 


1687 


gi|12056568| 
gb|AAG479 
45.1|AF119 
814_1


Homo sapiens 


MSTP063 


212 


88 


1689 


gi5689766 


Homo sapiens 


zinc finger 2.2 


222 


91 


1689 


AAU16267 


Homo sapiens 


HUMA- Human novel secreted protein, 
Seq ID 1220.


178 


58 


1689 


AAB99950 


Homo sapiens 


SHAN- Human alkylated-DNA-protein 
cysteine methyltransferase 14. 


177


60 


1690 


gi3328880 


Chlamydia 
trachomatis 


Protein Export 


73 


29 


1690 


gi2832232 


Brucella 

melitensis biovar 
Abortus 


flagellin; FliC 


67 


29 


1690 


gi17984285


Brucella 
melitensis 


FLAGELLIN


67 


29 


1692 


gi4927443 


Haemophilus 
influenzae 


hemoglobin/hemoglobin-haptoglobin 
binding protein 


93 


80 


1692 


gi4204775 


Haemophilus 
influenzae 


hemoglobin and hemoglobin- 
haptoglobin binding protein 


93 


80 


1692 


gi3647226 


Haemophilus 
influenzae 


hemoglobin binding protein 


93 


80 


1694 


AAW95631 


Homo sapiens 


GEMY Homo sapiens secreted protein 
gene clone hj968_2.


102 


100 


1694 


gi13162186


Homo sapiens 


calsyntenin-3 protein 


102 


100


1695


AAO04205 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 18097. 


81 


37 


1695 


gi160180


Plasmodium 
cynomolgi 


circumsporozoite antigen 


81 


29 


1695 


gi495522 


Plasmodium 
simiovale 


circumsporozoite protein 


80 


30 


1696 


AAM80223 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3869. 


252 


66 


1696 


AAM79239 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1901. 


252 


66 


1696 


gi3688394


Homo sapiens 


triple LIM domain protein 


252 


66 


1697 


gi19887715


Methanopyrus 
kandleri AV19


Predicted membrane protein 


74 


28 


1698 


AAM93184 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 2552. 


269 


87 


1698 


gi18044066


Mus musculus 


RIKEN cDNA 5033406L14 gene 


226 


76 


1698 


AAB95302 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 17538. 


194 


78 


1699 


ABB17279 


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 5936. 


110 


56 


1699 


AAO13013 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 26905. 


101 


71 


1699 


gi|7650258|g 
b|AAF65960 
.1|AF20777 
0_1


Hepatitis C virus


polyprotein 


74 


28 


1700 


gi12697585


Arabidopsis
thaliana


4-(cytidine 5'-phospho)-2-C-methyl-D-
erythritol kinase


69 


40 


1701 


gi16740569


Homo sapiens 


Similar to thymus expressed gene 3 


84 


27 


1701 


gi17940760


Mus musculus 


cask-interacting protein 2 


79 


26 


1701 


gi17940758


Homo sapiens 


cask-interacting protein 1 


77 


26 


1702 


gi17385401


Homo sapiens 


TPIP alpha lipid phosphatase 


234 


62 


1702 


AAU75783 


Homo sapiens 


INCY- Human protein phosphatase 1 
(PP1) protein sequence. 


208 


57 


1702 


AAG67638 


Homo sapiens 


HELI- Amino acid sequence of a 
human protein. 


202 


56 


1703 


AAO07887 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 21779. 


246 


85 


1703 


AAO08651 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 22543. 


239 


83 


1703 


AAO08732 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 22624. 


221 


80 


1704 


AAB94588 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 15392. 


82 


52 


1704 


gi3288914 


Mus musculus 


aortic carboxypeptidase-like protein 
ACLP 


82 


24 


1704 


AAM93437 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 3074. 


81


52


1706 


AAM86104 


Homo sapiens 


HUMA- Human 

immune/haematopoietic antigen SEQ
ID NO: 13697. 


179 


100 


1706 


gi10039425


Equus caballus 


ALR protein 


120 


40 


1706 


gi20502826 


Eimeria maxima 


cGMP-dependent protein kinase 


115 


35 


1707 


AAM70251 


Homo sapiens 


MOLE- Human bone marrow
expressed probe encoded protein SEQ
ID NO: 30557.


115


78






1707 


AAM57834 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 29939. 


115 


78 


1707 


gi15450860


Arabidopsis 
thaliana 


serine/threonine-protein kinase Mak 
(male germ cell-associated kinase)-like 
protein 


71 


56 


1708 


gi1620403


Homo sapiens 


SF1-Bo isoform


82 


41 


1708 


gi19072991


Hypocrea virens 


class III chitinase precursor


82 


40 


1708 


gi18765873


Hypocrea virens 


class III chitinase


82 


40 


1709 


AAM52240 


Homo sapiens 


INCY- Human MFAP4 SEQ ID NO 3. 


1384 


100 


1709 


gi790817 


Homo sapiens 


microfibril-associated glycoprotein 4


1384 


100 


1709 


AAM52239 


Homo sapiens 


INCY- Human MAG4V SEQ ID NO 1. 


1374 


100 


1710 


gi16769882


Drosophila 
melanogaster 


SD07884p 


67 


27 


1710 


gi|17545505| 
ref|NP_5189
07.1| 


Ralstonia 
solanacearum


CONSERVED HYPOTHETICAL 
PROTEIN 


66 


41 


1711 


AAU82954 


Homo sapiens 


ANAD- Human homologue of MPT 1 
protein target for antifungal compound 


111 


27 


1711 


gi2058326 


Homo sapiens 


subunit of RNA polymerase II 
transcription factor TFIID


111 


27 


1711 


gi13559031


Homo sapiens 


bA11M20.1 (TATA box binding
protein (TBP)-associated factor, RNA
polymerase II, C1, 130kD)


108 


26 


1712 


AAB65626 


Homo sapiens 


SUGE- Novel protein kinase, SEQ ID 
NO: 152. 


209 


82 


1712 


AAM25283 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO:798. 


209 


82 


1712 


AAU17269 


Homo sapiens 


HUMA- Novel signal transduction 
pathway protein, Seq ID 834. 


176 


67 


1713 


gi18256065


Mus musculus


Similar to ATPase, class II, type 9A 


127 


67 


1713 


AAM76495 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 36801. 


123 


70 


1713 


AAM63681 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 35786. 


123 


70 


1714 


gi8096269 


Nicotiana 
tabacum 


KED 


149 


28


1714 


gi1752736


Saccharomyces 
cerevisiae 


gene required for phosphorylation of
oligosaccharides/ has high homology 
with YJR061w 


148 


30 


1714 


gi2292986 


Rattus 
norvegicus 


cyclic nucleotide-gated channel beta 
subunit 


141 


28 


1715 


AAM72995 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 33301. 




Al 


1715 


AAM60359 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 32464. 


158 


47 


1715 


gi|13539605|
emb|CAC35
733.1|


Paramecium
tetraurelia


cyclophilin-RNA interacting protein


144


45










1716 


AAM71015 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 31321. 


251 


64 


1716 


AAM58517 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 30622. 


251 


64 


1716 


AAU19766 


Homo sapiens 


HUMA- Human novel extracellular 
matrix protein, Seq ID No 416. 


161 


44 


1718 


gi1420924


Zea mays 


INI 


75 


27 


1718 


gi|14521970| 
ref|NP_1274
47.1| 


Pyrococcus 
abyssi 


O-sialoglycoprotein endopeptidase 


73 


35 


1719 


gi20513851


Hordeum 
vulgare 


BPM 


74 


35 


1719 


gi21039126 


Cryptosporidium 
parvum 


60 kDa glycoprotein 


74 


26 


1719 


gi207158 


Rattus 
norvegicus 


bigtau 


73 


36 


1720 


gi18181943


Caenorhabditis 
elegans 


heparan sulfate GlcNAc transferase-I/II 


67 


34 


1720 


gi2058699 


Caenorhabditis 
elegans 


multiple exostoses homolog 2 


67 


34 


1720 


gi|17554740| 
ref|NP_4993
68.1| 


Caenorhabditis 
elegans 


MULTIPLE EXOSTOSES 
HOMOLOG 2 


67 


34 


1721 


AAM69150 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 29456. 


200 


38 


1721 


AAM56769 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 28874. 


200 


38 


1721 


gi4185947


Human 
endogenous 
retrovirus K 


pol protein 


196 


38 


1722 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


615 


60 


1722 


gi18676710


Homo sapiens 


FLJ00254 protein 


592 


60 


1722 


gi|20469453| 
ref|XP_1140
40.1| 


Homo sapiens 


similar to FLJ00254 protein


283 


50 


1723 


gi13881755


Mycobacterium 

tuberculosis 

CDC1551 


cation efflux system protein 


74 


30 


1724 


AAG78866 


Homo sapiens 


SHAN- Human zinc finger protein 15. 


141 


68 


1724 


ABB17928


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 6585. 


99 


53 


1724 


gi|21295712| 

gb|EAA078 

57.1| 


Anopheles 
gambiae str. 
PEST 


agCrloil 






1725 


gi21104340


Homo sapiens 


obscurin 


1586 


83 


1725 


gi7024535 


Gallus gallus 


structural muscle protein titin 


207 


24 


1725 


gi1513030


Gallus gallus 


connectin/titin 


207 


24 


1727 


AAE19162 


Homo sapiens 


THOR/ Human kinase polypeptide 
(PKIN-20). 


1096 


99


1727


gi2736151 


Rattus 
norvegicus 


myotonic dystrophy kinase-related
Cdc42-binding kinase 


902 


78 


1727


gi1695873


Homo sapiens 


ser-thr protein kinase PK428 


896 


77 


1728 


AAY99411 


Homo sapiens 


GETH Human PRO1487 (UNQ756)
amino acid sequence SEQ ID NO:260. 


862 


67 


1728 


gi15617453


Homo sapiens 


chondroitin synthase 


862 


67 


1728 


AAE15959 


Homo sapiens 


EUMO- Human 4589624/92-303 
protein, member of Fringe and Brainiac 
family. 


761 


79 


1729 


gi|15804980|
ref|NP_2909
60.1| 


Escherichia coli 

O157:H7

EDL933 


Uncharacterized conserved protein 


71 


33 


1731 


gi14268490


Musca domestica 


hunchback 


82 


33 


1731 


AAM93401 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 3002. 


76 


27


1731 


gi2076606 


Musca domestica 


hunchback zinc finger protein 


73 


30 


1732 


AAY91949 


Homo sapiens 


INCY- Human cytoskeleton associated 
protein 4 (CYSKP-4). 


1047 


57 


1732 


ABB90754 


Homo sapiens 


UYJO Human Tumour Endothelial
Marker polypeptide SEQ ID NO 240. 


1043 


57 


1732 


gi619577 


Gallus gallus 


cardiac muscle tensin 


1043 


56 


1733 


gi3090889 


Homo sapiens 


synapsin IIIa


70 


38 


1733 


gi6572355 


Homo sapiens 


cE86D10.1 (synapsin III)


70 


38 


1733 


gi|19924105| 
ref|NP_0034
81.2|


Homo sapiens 


synapsin III, isoform IIIa


70 


38 


1734 


AAB85144 


Homo sapiens 


HUMA- Human NKCR polypeptide 
(clone ID HMSOM53). 


1506 


93 


1734 


gi4973126 


Mus musculus
castaneus 


high affinity immunoglobulin gamma 
Fc receptor I 


490 


39 


1734 


gi4973124 


Mus musculus 


high affinity immunoglobulin gamma 
Fc receptor I 


489 


39 


1735 


gi|15597595| 
ref|NP_2510
89.1| 


Pseudomonas 
aeruginosa 


pyoverdine synthetase D 


69 


30 


1736 


gi14488302


Oryza sativa 


Putative transposon protein 


81 


24 


1736 


gi3851516 


Phytophthora 
infestans 


cyst germination specific acidic repeat 
protein precursor 


72 


33 


1736 


gi|14488302| 
gb|AAK638 
83.1|AC074 
105_12


Oryza sativa 


Putative transposon protein 


81


24 


1737 


AAB85357 


Homo sapiens 


INCY- Human phosphatase (PP) (clone 
ID 3402521 CD1). 


1591 


100


1737 


gi21205864 


Homo sapiens 


T-cell activation protein phosphatase 
2C; TA-PP2C 


1591 


100 


1737 


gi21464366 


Drosophila 
melanogaster 


RE06653p 


758 


52 


1738 


gi7271811 


Drosophila 
melanogaster 


GTPase activating protein 


292 


38


1738 


AAM76430 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 36736. 


246 


100 


1738 


AAM63615 


Homo sapiens 


MOLE- Human brain expressed single
exon probe encoded protein SEQ ID
NO: 35720.


246


100






1739 


ABB50365 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 65 SEQ ID NO:313.


272 


87 


1739 


AAW88598 


Homo sapiens 


HUMA- Secreted protein encoded by 
gene 65 clone HFVHY45. 


272 


87 


1739 


ABB50764 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 65 SEQ ID NO:716. 


143 


92 


1740 


fi i2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


1210 


58 


1740 


gi|10834720| 
gb|AAG237 
90.1|AF258 
587_1


Homo sapiens 


PP565 


274 


80 


1740 


gi|385615|gb 
|AAB26708. 


Mus sp.


fibulin gene homolog 


248 


75 


1741 


ABB90748 


Homo sapiens 


UYJO Human Tumour Endothelial 
Marker polypeptide SEQ ID NO 228. 


2116 


97 


1741 


gi15987493


Homo sapiens 


tumor endothelial marker 6 


2116 


97 


1741 


ABB90754 


Homo sapiens 


UYJO Human Tumour Endothelial 
Marker polypeptide SEQ ID NO 240. 


530 


37 


1742 


ABB11753 


Homo sapiens 


HYSE- Human NOV/plexin-A1
homologue, SEQ ID NO:2123. 


291 


90 


1742 


gi1665757


Mus musculus 


plexin 1 


291 


90 


1742 


gi6010217 


Homo sapiens 


NOV/plexin-A1 protein


291 


90 


1743 


AAM79514 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3160. 


149 


90 


1743 


AAM78530 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1192. 


149 


90 


1743 


gi1244510


Homo sapiens 


p311 protein


149 


90 


1744 


AAG93324 


Homo sapiens 


NISC- Human protein HP10370.


83 


41 


1744 


gi21064771


Drosophila 
melanogaster 


RH61467p 


83 


46 


1744 


gi18676554


Homo sapiens 


FLJ00174 protein


77 


41 


1745 


gi4128039 


Homo sapiens 


TL132 protein 


81 


29 


1745 


gil7983U8 


Brucella 
melitensis 


METAL DEPENDENT HYDROLASE 


74 


23 


1745 


AAU75578 


Homo sapiens 


UYNA- Human ubiquitin specific 
protease 10 (USP10). 


71 


31 


1746 


gi15074154


Sinorhizobium 
meliloti 


PUTATIVE FATTY 
ACID/PHOSPHOLIPID SYNTHESIS
PROTEIN 


76 


25 


1746 


gi1869833


human 
herpesvirus 2 


myristylated tegument protein 


75 


27 


1746 


gi20516045 


Thermoanaerobacter
tengcongensis


Chemotaxis response regulator CheB, 
consists of CheY-like receiver domain 
and a methylesterase (demethylase) 
domain 


69 


20 


1747 


gi18025496


cercopithecine
herpesvirus 15 


EBNA-1 


124 


37 


1747 


gi5821153 


Homo sapiens 


RNA binding protein 


123 


29 


1747 


gi6649242 


Homo sapiens 


splicing coactivator subunit SRm300 


123 


29 


1748 


gi|4321764|g
b|AAD1581
9.1|


Mus musculus


MAP kinase kinase 7 alpha 2


65


30










1748 


gi|20859704| 
ref|XP_1339
86.1| 


Mus musculus 


mitogen activated protein kinase kinase 
7 


65 


30 


1748 


gi|4321768|g 
b|AAD1582 
1.1|


Mus musculus 


MAP kinase kinase 7 beta 2 


65 


30 


1749 


AAB50964 


Homo sapiens 


GETH Human PRO1313 protein.


439 


89 


1749 


AAB47290 


Homo sapiens 


GETH PRO1313 polypeptide.


439 


89 


1749 


AAB24431 


Homo sapiens 


GETH Human PRO1313 protein
sequence SEQ ID NO:216. 


439 


89 


1750 


AAU00502 


Homo sapiens 


MILL- Human TANGO 437 protein. 


115 


91 


1750 


gi20384654 


Homo sapiens 


two-pore calcium channel protein 2 


115 


91 


1750 


AAM91059 


Homo sapiens 


HUMA- Human 

immune/haematopoietic antigen SEQ 
ID NO: 18652. 


93 


64 


1751 


gi10440494


Homo sapiens 


FLJ00092 protein 


252 


97 


1751 


AAM40956 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5887. 


80 


30 


1751 


gi|10440494| 
dbj|BAB157 
80.1| 


Homo sapiens 


FLJ00092 protein 


252 


97 


1752 


gi15980036


Yersinia pestis 


2-dehydro-3-deoxyphosphooctonate 
aldolase 


77 


46 


1752 


gi11322261


Diceros bicornis


alpha adrenergic receptor 2B


74 


26 


1752 


gi20516240


Thermoanaerobacter
tengcongensis


methylaspartate mutase 


73 


25 


1753 


gi19684014


Homo sapiens 


similar to brain-specific angiogenesis 
inhibitor 3 (H. sapiens) 


1387 


99 


1753 


AAB88367 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0101. 


1380 


99 


1753 


gi1469936


Mus musculus 


FGF-binding protein 


158 


29 


1754 


AAB01397 


Homo sapiens 


INCY- Neuron-associated protein. 


435 


92 


1754 


gi21218140 


Homo sapiens 


rab effector MYRIP 


435 


92 


1754 


gi21320161 


Mus musculus 


exophilin 8 


378 


77 


1755 


AAM74815 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 35121. 


253 


75 


1755 


AAM62013 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 34118. 


253 


75 


1755 


AAM70390 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 30696. 


228 


62 


1756 


gi6460201 


Deinococcus 
radiodurans


phenylacetic acid degradation protein 
PaaA 


85 


27 


1756 


gi3309543 


Takifugu 
rubripes 


MLL 


79 


34 


1756 


AAT10059_ 
aa1


Homo sapiens 


USSH erbB-3 cDNA clone E3-16. 


74 


31 


1757 


gi18676406


Homo sapiens 


FLJ00021 protein


70 


36 


1758 


gi13423395


Caulobacter 
crescentus CB15


NADH dehydrogenase I, M subunit 


78 


37


1758


gi|17506337| 
ref|NP_4913
90.1| 


Caenorhabditis
elegans 


D1007.15.p 


82 


24 


1758 


gi|16126181| 
ref|NP_4207
45.1| 


Caulobacter 
crescentus CB15 


NADH dehydrogenase I, M subunit 


78 


37 


1759 


gi19881193


chimpanzee 
cytomegalovirus 


transcriptional transactivator TRS1 


83 


29 


1759 


gi19881161


chimpanzee 
cytomegalovirus 


transcriptional transactivator IRS1 


83 


29 


1759 


gi556297 


Mus musculus 


alpha- 1 type IV collagen 


81 


33 


1760 


gi18033185


Danio rerio 


UNC45-related protein 


702 


79 


1760 


AAG77802 


Homo sapiens 


HUMA- Human HOGEN50 
serine/threonine phosphatase protein 
sequence. 


603 


65 


1760 


AAM40290 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 3435. 


603 


65 


1761 


gi6634123 


Drosophila 
melanogaster 


SoxNeuro 


70 


24 


1762 


gi|14245700| 
dbj|BAB561
42.1| 


Giardia 
intestinalis 


kinesin-like protein 4 


69 


26 


1762 


gi|165011|gb 
|AAA31246.
1|


Oryctolagus 
cuniculus 


eucaryotic release factor (eRF) 


69 


24 


1762 


gi|15559188|
emb|CAC03 
424.2| 


Homo sapiens 


dJ45P21.3 (butyrophilin, subfamily 3, 
member A1)


69 


26 


1763 


AAM93661 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 3536. 


186 


80 


1763 


AAM64398 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 36503. 


154 


76 


1763 


gi|20556958| 
ref|XP_0615
62.5| 


Homo sapiens 


similar to PAM COOH-terminal 
interactor protein 1 


73 


43 


1764 


AAU17223 


Homo sapiens 


HUMA- Novel signal transduction 
pathway protein, Seq ID 788. 


211 


87 


1765 


gi1334546


Podospora 
anserina 


Dod COI il3 grp IB protein 


71 


37 


1765 


gi5679307 


Mus musculus 


RORgamma t 


70 


27 


1765 


gi4186077 


Mus musculus 


ROR gamma T protein 


70 


27 


1766 


gi17864081


Mus musculus 


PPAR gamma coactivator-1beta protein


74 


26 


1766 


gi44795 


Methanococcus 
voltae 


polyferredoxin 


71 


28 


1766 


gi14279670


Lycopersicon 
esculentum 


verticillium wilt disease resistance 
protein 


71 


31 


1768 


AAE06588 


Homo sapiens 


SAGA Human protein having
hydrophobic domain, HP10778. 


165 


100 


1768 


AAM40979 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5910. 


165 


100 


1768 


AAB24542 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 27 SEQ ID 
NO: 168. 


73 


30


1769


gi6174840 


Achromobacter
xylosoxidans
subsp.
xylosoxidans


low-specificity D-threonine aldolase


78 


33 


1769 


gi16769806


Drosophila 
melanogaster 


SD02660p 


75 


23 


1769 


gi1098473


Rattus 
norvegicus 


insulin-like growth factor binding 
protein 


73 


31 


1770 


AAP94684 


Homo sapiens 


CHIL Amino acid sequence encoded 
by part of human mannose binding 
protein (hMBP) genomic DNA.


79 


56 


1770 


gi|15790548| 
ref|NP_2803
72.1| 


Halobacterium 
sp. NRC-1 


cobyric acid synthase; CbiP 


69 


36 


1770 


gi|11467609|
ref|NP_0506
61.1| 


Guillardia theta 


Clp protease ATP binding subunit 


69 


27 


1772 


gi5532460


Shigella flexneri 


ShiF 


66 


32 


1773 


gi11544663


Arabidopsis 
thaliana 


PTPKIS1 


75 


42 


1773 


gi11595504


Arabidopsis 
thaliana 


PTPKIS1 protein 


75 


42 


1773 


gi18389331


Mus musculus 


2',5'-oligoadenylate synthetase-like 10 


73 


42 


1774 


AAM06519 


Homo sapiens 


HYSE- Human foetal protein, SEQ ID 
NO: 250. 


414 


90 


1774 


gi|18552248| 
ref|XP_0925
10.1|


Homo sapiens 


similar to latent transforming growth 
factor beta binding protein 1; latent 
TGF beta binding protein 


69 


37 


1775 


gi4884924


Rangiferine 
herpesvirus 1 


glycoprotein C 


67 


60 


1775 


AAB94152 


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO: 14435. 


65 


34 


1775 


AAB93253 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 12271.


65 


34 


1776 


gi13424176


Caulobacter 
crescentus CB15 


N-carbamyl-L-amino acid
amidohydrolase 


89 


24 


1776 


gi514267 


Homo sapiens 


proto-oncogene tyrosine-protein kinase 


86 


29 


1776 


gi28237 


Homo sapiens 


p150 protein (AA 1-1130)


84 


28


1777 


gi63370 


Gallus gallus 


dystrophin (AA 1 - 3660) 


68 


31 


1777 


gi|3046783|e
mb|CAA680 
3j.1| 


Scyliorhinus
canicula 


dystrophin 


67 


29 


1777 


gi|2342682|g 

b|AAB7040

6.1| 


Arabidopsis 

thaliana 


Contains similarity to Rattus AMP- 
activated protein kinase (gb|X95577). 


67 


31 


1778 


AAE16176 


Homo sapiens 


INCY- Human G-protein coupled 
receptor / \OvyJKx>i^- / ) protein. 


1419 


100 


1778 


AAE18021 


Homo sapiens 


CURA- Human G-protein coupled 
receptor-8a (GPCR-8a) protein. 


1419 


100 


1778 


AAG72411 


Homo sapiens 


YEDA Human OR-like polypeptide 
query sequence, SEQ ID NO: 2092. 


1419 


100 


1779 


AAM76040 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 36346. 


93 


48


1779


AAM63227 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 35332. 


93 


48 


1779 


gi12620576


Bradyrhizobium 
japonicum 


ID342 


87 


24 


1780 


gi2459833 


Rattus 
norvegicus 


Maxpl 


81 


il 


1780


AAB65650 


Homo sapiens 


SUGE- Novel protein kinase, SEQ ID 
NO: 177. 


80


35 


1780 


AAM39805 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2950. 


80 


36 


1781 


gi4877963 


Mus musculus


NF-kappaB inducing kinase 


69 


k\ 
5y 


1781 


gi15077865


Mus musculus 


bullous pemphigoid antigen 1-b 


67 


35 


1781 


gi15077863


Mus musculus 


bullous pemphigoid antigen 1-a


67 


35 


1782 


gi4138265 


Nicotiana 
tabacum 


Avr9 elicitor response protein 


76 


27 


1782 


gi12725153


Lactococcus 
lactis subsp. 
lactis


50S ribosomal protein L3 


75 


32 


1782 


AAB21008 


Homo sapiens 


INCY- Human nucleic acid-binding
protein, NuABP-12. 


73 


32 


1783 


gi3947714 


Streptococcus 
agalactiae 


initiation factor IF2 


86 


20 


1783 


gi9558387 


Streptococcus 
agalactiae 


initiation factor 2 


86 


20 


1783 


gi9558369 


Streptococcus 
agalactiae 


initiation Factor 2 


86 


20 


1786 


gi435855 


Mus sp. 


CREB-binding protein; CBP 


75 


22 


1786 


gi2911464 


Leishmania 
tarentolae 


sodium stibogluconate resistance 
protein 


75 


34 


1786 


gi19547887


Mus musculus 


CREB-binding protein


75 


22 


1787 


gi3747099 


Mus musculus 


C1q-related factor


616 


61 


1787 


gi14278927


Mus musculus 


gliacolin 


615 


64 


1787 


gi10566471


Mus musculus 


Gliacolin 


615 


64 


1788 


gi|21291197| 

gb|EAA033 

42.1|


Anopheles 
gambiae str. 
PEST 


agCP7579 


71 


20 


1788 


gi|20803964| 
emb|CAD31 
541.1|


Mesorhizobium 
loti 


HYPOTHETICAL PROTEIN 


69 


43 


1789 


AAM41125 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6056. 


320 


80 


1789 


AAM39339 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2484. 


320 


80 


1789 


AAM79857 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3503. 


320 


80 


1790 


gi1143585


Paracentrotus 
lividus 


2 alpha fibrillar collagen


<Q 


23 


1791 


gi9837427 


Lytechinus
variegatus 


embryonic blastocoelar extracellular 
matrix protein precursor 


116 


34 


1791 


gi14089698


Mycoplasma 
pulmonis 


OLIGOPEPTIDE ABC 
TRANSPORTER PERMEASE 
PROTEIN 


71 


23 


1791 


gi6572111 


Bartonella
quintana


riboflavin synthase alpha chain


69


29








1792 


gi|4506023|r 
ef|NP_0027
10.1| 


Homo sapiens 


protein phosphatase 2, regulatory 
subunit B (B56), gamma isoform 


68 


39 


1793 


AAM71170 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 31476. 


180 


82 


1793 


AAM58664 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 30769. 


180 


82 


1793 


AAM65679 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 37784. 


168 


71 


1794 


AAG00072 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 4153. 


125 


80 


1794 


AAW34618 


Homo sapiens 


IMUT- Human C3 protein mutant DV-
7N.


125 


80 


1794 


AAW34617 


Homo sapiens 


IMUT- Human C3 protein mutant DV- 
6. 


125 


80 


1795 


AAY05069 


Homo sapiens 


SMIK Human PIGR-2 protein 
sequence. 


1055 


85 


1795 


gi396170 


Homo sapiens 


CMRF-35 antigen 


406 


45 


1795 


gi18490143


Homo sapiens 


CMRF35 leukocyte immunoglobulin- 
like receptor 


406 


45 


1796 


gi|6723273|d 
bj|BAA8965
9.1| 


Baboon 
endogenous 
virus strain M7 


gag-pol precursor polyprotein 


421 


41 


1796 


gi|13940448| 
gb|AAK503 
81.1|U43202
2 


Murine leukemia 
virus 


pol precursor protein 


421 


41 


1796 


gi|331995|gb 
|AAB03091.
1|


AKV murine
leukemia virus 


gag-pol polyprotein (tag amber codon 
at 2250-2252 inserts Gln in Mo-MuLV)


421 


41 


1797 


gi21411325 


Homo sapiens 


Similar to LOC205103


260 


73 


1797 


gi|4835878|g 
b|AAD3028
0.1|AF1348
38_1


Homo sapiens 


endocytic receptor Endo180


77 


31 


1797 


gi|16076075| 
emb|CAC94
295.1| 


Leishmania 

donovani 

donovani 


trypanothione reductase


70 


30 


1798 


gi927721 


Saccharomyces 
cerevisiae 


Sip1p: SNF1 protein kinase substrate;
YDR422C;CAI: 0.13 


72 


34 


1798 


gi172604


Saccharomyces 
cerevisiae 


protein kinase 


72


34 


1798 


gi|6320630|r
ef|NP_0107
10.1|


Saccharomyces 
cerevisiae 


SNF1 protein kinase substrate; Sip1p


72 


34 


1799 


gi|20839768| 
ref|XP_1303
11.1| 


Mus musculus


similar to GDP-fucose transporter 1 


71 


29 


1801 


gi|17461642|
ref|XP_0662
49.1|


Homo sapiens


similar to Ig kappa chain


78


23










1801 


gi|6325342|r 
ef|NP_0154
10.1|


Saccharomyces 
cerevisiae 


Protein required for cell viability, 
Ypr085cp 


76 


22 


1801 


gi|9635081|r 
ef|NP_0578
09.1|


Gallid 

herpesvirus 2 


UL47 


74 


2o 


1802 


AAB94148 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 14427.


250 


56 


1802 


AAG64564 


Homo sapiens 


SHAN- Human zinc-finger protein 60. 


250


JO 


1802 


AAM79356 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3002. 


250 


56 


1803 


AAW81754 


Homo sapiens 


BOEF Human Fanconi anaemia- 
associated gene II protein. 


631 


85 


1803


gi2407911 


Homo sapiens 


differentially expressed in Fanconi 
anemia 


555 


74 


1803 


gi6013073 


Mus musculus 


HemT-3 protein 


89 


24 


1805 


gi14189735


Homo sapiens 


ATP-binding cassette transporter 
family A member 12 


1508 


90 


1805 


gi1943947


Bos taurus 


ABC transporter 


404 


31 


1805 


AAZ94734_ 
aa1


Homo sapiens 


FARB Human ATP binding cassette 
ABCA1 (ABC1) cDNA. 


395 


33 


1806 


AAU12234 


Homo sapiens 


GETH Human PRO4350 polypeptide 
sequence. 


859 


100 


1806 


AAA96344_ 
aa1


Homo sapiens 


GETH cDNA encoding a novel 
polypeptide designated PRO4357.


498 


48 


1806


AAU12445 


Homo sapiens 


GETH Human PRO4357 polypeptide
sequence. 


498 


48 


1807 


gi190396


Homo sapiens 


profilaggrin 


76 


29 


1808 


AAB88367 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0101. 


74 


30 


1808 


gi19684014


Homo sapiens 


similar to brain-specific angiogenesis 
inhibitor 3 (H. sapiens) 


74 


30 


1808 


gi|18576362| 
ref|XP_0844
81.1|


Homo sapiens 


similar to fibroblast growth factor 
binding protein 1 


74 


30 


1809 


gi530876 


Chlamydomonas 
reinhardtii 


amino acid feature: Rod protein 
domain, aa 266 .. 468; amino acid 
feature: globular protein domain, aa 32 
.. 265 


126 


35 


1809 


gi6578849 


Myxococcus 
xanthus 


FrgA 


126 


29 


1809 


gi2429362 


Santalum album 


proline rich protein 


122 


27 


1810 


gi17428288


Ralstonia 
solanacearum 


PROBABLE CATION- 
TRANSPORTING ATPASE 
LIPOPROTEIN TRANSMEMBRANE 


75 


28 


1810 


gi21483422 


Drosophila 
melanogaster 


T T\1A% Alt* 


71 


29 


1810 


ABB90042 


Homo sapiens 


HUMA- Human polypeptide SEQ ID 
NO 2418. 


70 


32 


1811 


gi|20915248| 
ref|XP_1451
60.1|


Mus musculus 


similar to Collagen alpha 1(VI) chain 
precursor 


148 


74 


1812 


gi2104558 


Rattus
norvegicus


CCA3


1150


90








1812 


AAB64963 


Homo sapiens 


ROSE/ Human secreted protein 
sequence encoded by gene 24 SEQ ID 
NO:141. 


172 


37 


1812 


gi12963869


Mus musculus 


gene trap ankyrin repeat containing 
protein 


1 n 
1 /z 




1813 


AAB65201 


Homo sapiens 


GETH Human PRO1009 (UNQ493) 
protein sequence SEQ ID NO:194.


ZUo 


i nn 

1UU 


1813 


AAY66678 


Homo sapiens 


GETH Membrane-bound protein 
PRO1009. 


ZUo 


IftO 
1UU 


1813 


AAB24068 


Homo sapiens 


GETH Human PRO1009 protein
sequence SEQ ID NO:36. 


ZUo 


inn 


1815 


AAG89314 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 434. 


1 01 


1 HA ' 
1UU 


1815 


gi6460052 


Deinococcus 
radiodurans 


dipeptidyl peptidase IV-related protein 


00 


OU 


1816 


gi1052594


Drosophila 
melanogaster 


trithorax protein trxI


75 


26 


1816 


gi1052593


Drosophila 
melanogaster 


trithorax protein trxII


75 


26 


1816 


gi158818


Drosophila 
melanogaster 


zinc-binding protein 


75 


26 


1817 


AAB49765 


Homo sapiens 


HELI- Human proliferation 
differentiation factor amino acid
sequence. 


229 


94 


1817 


AAB88393 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0137. 


229 


94 


1817 


gi18446895


Drosophila 
melanogaster 


AT05866p 


73 


25 


1818 


gi6573212 


Giardia 
intestinalis 


variant-specific surface protein H7-1 


73 


32


1818 


gi159143


Giardia 
intestinalis 


variant-specific surface protein H7 


73 


32 


1818 


gi15144254


Micrurus 
corallinus 


neurotoxin homologue 8 


72 


32


1819 


gi161857


Tetrahymena 
thermophila


surface antigen 


69 


35 


1821 


gi913964 


Carcinoscorpius
rotundicauda


factor C 


80


26


1821 


gi217397 


Tachypleus 
tridentatus 


limulus factor C precursor 


80


26


1821 


gi18542425


Tachypleus 
tridentatus 


factor C precursor 


80 


26 


1822 


gi9309473


Mus musculus 


DNMT1 associated protein-1


74 


37 


1822 


gi1666895


Homo sapiens 


CHL1 protein 


74 


23 


1822 


gi16923930


Mus musculus 


MAT1-mediated transcriptional


74 


ii 


1823 


gi9058659 


Canis familiaris


skeletal muscle chloride channel ClC-1


73 


34 


1823 


gi433182 


Drosophila 
melanogaster 


receptor protein tyrosine phosphatase 


72 


26 


1823 


gi20429105 


Paracoccus
zeaxanthinifaciens


decaprenyl diphosphate synthase 


72 


27 


1824 


gi13374178


Mus musculus


TAFII140 protein 


612 


88


1824 


gi17861888


Drosophila 
melanogaster


GM10839p 


246 


49 


1824 


gi6634096 


Drosophila 
melanogaster 


BIP2 protein 


242 


48 


1825 


gi16605480


Homo sapiens


G6b-C protein 


1159 


100 


1825 


gi16605484


Homo sapiens 


G6b-E protein 


1 AAA 


on 
yu 


1825 


gi5304877 


Homo sapiens


immunoglobulin receptor


1003 


83 


1826 


AAB94636 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:15515.


105 


37 


1826 


AAU15903


Homo sapiens 


HUMA- Human novel secreted protein, 
Seq ID 856. 


1 AC 


17 
Jl 


1826 


gi21430928


Drosophila 
melanogaster 


SD27341p 


93 


Dy 


1827 


AAR33270 


Homo sapiens 


WIST- T cell receptor alpha chain 
clone alpha 1.3. 


329 


92


1827 


gi1806100


Homo sapiens 


T cell receptor alpha chain 


329 


92


1827 


gi2358032


Homo sapiens 


TCRAV8S3 


329 


92 


1828 


gi20513851 


Hordeum
vulgare


BPM 


73 


45 


1828 


AAO01897 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 15789. 


70 


35 


1828 


AAE16477 


Homo sapiens 


OSTE- Human collagen alpha 1 (II)
protein. 


69 


31 


1829 


AAG66837 


Homo sapiens 


SHAN- Human ATP-dependent serine 
proteinase 31. 


356 


100 


1829 


AAG66838 


Homo sapiens 


SHAN- Human ATP-dependent serine 
proteinase 31 N-terminal peptide.


89 


100 


1829 


gi5881591


Gallus gallus 


homeodomain protein 


77 


38 


1830 


AAB94294 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:14745.


951 


99 


1830 


gi10504968


Drosophila 
melanogaster 


rho guanine nucleotide exchange factor 
4 


180 


22 


1830 


gi16197921


Drosophila 
melanogaster 


LD03170p 


180 


22 


1831 


ABB12353 


Homo sapiens 


HYSE- Human bone marrow expressed 
protein SEQ ID NO: 107. 


199 


30 


1831 


gi20452161


Canis familiaris


retinitis pigmentosa GTPase regulator 


143 


24 


1831 


gi2062609


Xenopus laevis 


middle molecular weight neurofilament 
protein NF-M(1)


140 


24 


1832 


AAB29778 


Homo sapiens 


RHOD- Human MSF-derived 
fibronectin.


148 


18 


1832 


gi142161


Anaplasma 
marginale 


surface antigen Am105


141 


25 


1832 


gi4808177


Drosophila 
subobscura 


largest subunit of the RNA polymerase 
II complex 


141 


20 


1833 


AAM66321 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ
ID NO: 26627. 


424 


51 


1833 


AAM53933 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26038. 


424 


51 


1833 


gi|6723273|d
bj|BAA8965
9.1|


Baboon 
endogenous 
virus strain M7 


gag-pol precursor polyprotein 


357 


47


1834


AAM88756 


Homo sapiens 


HUMA- Human
immune/haematopoietic antigen SEQ
ID NO: 16349. 


208 


100


1834 


gi20417 


Persea americana 


cellulase 


77 


34 


1834 


gi153337


Streptomyces 
tenebrarius 


kanamycin-apramycin resistance 
methylase 


69 


26 


1837 


AAY02893 


Homo sapiens 


HUMA- Fragment of human secreted 
protein encoded by gene 92. 


76 


41 


1837 


AAY99429 


Homo sapiens 


GETH Human PRO1563 (UNQ769)
amino acid sequence SEQ ID NO:317.


73 


35 


1837 


gi6634084 


Drosophila 
melanogaster 


malate dehydrogenase (NADP- 
dependent oxaloacetate 
decarboxylating), malic enzyme 


73 


39 


1838 


gi2865602 


Saccharopolyspora
sp.


Sapl M2 methyltransferase 


77 


37 


1838 


gi3089358 


Rattus 
norvegicus 


MARRLC2A 


75 


33 


1838 


gi|2865602|g
b|AAC9718
2.1|


Saccharopolyspora
sp.


Sapl M2 methyltransferase 


77 


37 


1839 


AAM69149 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 29455. 


154 


96 


1839 


AAM56768 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 28873. 


154 


96 


1839 


AAW96209 


Homo sapiens 


SMIK Amyloid precursor protein
(APP) C-terminal fragment 


102 


78 


1840 


gi9946563 


Pseudomonas 
aeruginosa 


probable type II secretion system 
protein 


81 


36 


1840 


gi21108565


Xanthomonas 
axonopodis pv. 
citri str. 306 


pseudouridylate synthase 


75 


35 


1840 


ABB04714 


Homo sapiens 


SHAN- Human PP1744 protein SEQ 
ID NO:23.


74 


31 


1841 


gi1491949


Molluscum 
contagiosum
virus subtype 1 


MC006L 


85 


30 


1841 


AAM42085 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 7016. 


81 


27 


1841 


AAM40299 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 3444. 


81 


27 


1842 


gi20381413 


Homo sapiens 


Similar to LOC160680 


216 


44 


1842 


gi13592175


Leishmania 
major 


PPg3 


144 


24 


1842 


gi5420387 


Leishmania 
major 


proteophosphoglycan 


140 


23 


1843 


AAB87181 


Homo sapiens 


MILL- Human secreted protein 
MANGO 349 E41D variant, SEQ ID
NO:231. 


278 


42 


1843 


AAB87128 


Homo sapiens 


MILL- Human secreted protein 
MANGO 349, SEQ ID NO: 130. 


278 


42 


1843 


AAB87179 


Homo sapiens 


MILL- Human secreted protein
MANGO 349 121K variant, SEQ ID
NO:227.


276


41


1844 


AAE14341 


Homo sapiens 


INCY- Human protease PRTS-6
protein.


886 


93 


1844 


gi16768276


Drosophila 
melanogaster 


GH27809p 


290 


41 


1844 


gi2655204 


Mus musculus 


ubiquitin-specific protease 


258 


35 


1846 


AAY88300 


Homo sapiens 


MILL- Human TANGO 187-3 protein. 


1334 


90 


1846 


gi13097780


Homo sapiens 


Similar to RIKEN cDNA 2810037C14 
gene 


1326 


90 


1846 


AAY88296 


Homo sapiens 


MILL- Human TANGO 187-2/3
protein. 


1312 


87 


1847 


AAG74984 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:5748. 


75 


32 


1847 


gi17352449


Rattus 
norvegicus 


ErbB3/Her3 precursor 


74 


38 


1847 


gi|20860870| 
ref|XP 1256 
64.1| 


Mus musculus 


similar to H4(D10S170) protein 




32 


1848 


gi3123530


Fowlpox virus 


rpDL, orthologue of vaccinia I3L 


75 


27 


1848 


gi5902659 


Drosophila 
melanogaster 


ring canal protein 


70 


27 


1848 


gi|18110218|
ref|NP 4765 
89.2| 


Drosophila 
melanogaster 


kel-P2 


70 


27 


1849 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


614 


78 


1849 


AAM65715 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26021. 


548 


73 


1849 


AAM53338 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 25443. 


548 


73 


1850 


gi10999071


Lophognathus 
longirostris 


NADH dehydrogenase subunit 2 


74 


23 


1850 


gi18537243


Human
immunodeficiency
virus type 1


envelope glycoprotein 


74 


29 


1850 


gi|10999071|
gb|AAG006 
22.2|AF128 
462 2 


Lophognathus 
longirostris 


NADH dehydrogenase subunit 2 


74 


23 


1851 


gi|17448210| 
ref|XP_0685
03.1| 


Homo sapiens 


similar to 60 kDa heat shock protein, 
mitochondrial precursor (Hsp60) (60 
kDa chaperonin) (CPN60) (Heat shock 
protein 60) (HSP-60) (Mitochondrial 
matrix protein PI) (P60 lymphocyte 
protein) (HuCHA60) 


72 


28 


1852 


gi1164937


Saccharomyces 
cerevisiae 


YOR3160w 


74 


31


1852 


gi3176662


Arabidopsis 
thaliana 


Similar to mannosyl-oligosaccharide
glucosidase gb|X87237 from Homo
sapiens. 


73 


31 


1852 


gi13398928


Arabidopsis 
thaliana 


alpha-glucosidase 1 


73 


31 


1853 


gi|20889364|
ref|XP_1384
29.1|


Mus musculus


similar to hepatitis A virus cellular
receptor 1; T cell immunoglobin
domain and mucin domain protein 1


76


36


1853 


gi|21288202|
gb|EAA005
23.1|


Anopheles 
gambiae str. 
PEST 


agCP9342 


71 


32 


1854 


AAB88481 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0251. 


776 


99 


1854 


AAE03835 


Homo sapiens 


HUMA- Human gene 18 encoded 
secreted protein HFKHW50, SEQ ID 
NO: 81. 


776 


99 


1854 


AAE03863 


Homo sapiens 


HUMA- Human gene 18 encoded 
secreted protein HFKHW50, SEQ ID 
NO: 109. 


716 


97 


1855 


gi1663748


Chlamydomonas 
reinhardtii 


dynein heavy chain 7 


82 


29 


1855 


gi1663744


Chlamydomonas 
reinhardtii 


dynein heavy chain 5 


80 


28 


1855 


gi1663738


Chlamydomonas 
reinhardtii 


dynein heavy chain 2 


80 


27 


1856 


gi18032120


Gallus gallus 


shal-like voltage-gated potassium 
channel 


75 


23 


1856 


gi1408569


Haemophilus 
influenzae 


adhesion and penetration protein 


71 


28 


1856 


gi|18032120| 
gb|AAL566
33.1|AF075 
160 1 


Gallus gallus 


shal-like voltage-gated potassium 
channel 


75 


23 


1857 


AAM67180 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27486. 


129 


44 


1857 


AAM54795 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26900. 


129 


44 


1857 


gi|21040255|
ref|NP_6319
07.1|


Homo sapiens 


splicing factor, arginine/serine-rich 12 


109 


29 


1858 


gi21392190 


Drosophila 
melanogaster 


RE74758p 


71 


39 


1858 


gi9954108 


Trypanosoma 
cruzi 


RNA binding protein RGGm 


68 


40 


1858 


gi20302994 


Medicago 
truncatula 


nodule-specific glycine-rich protein 1C 


66 


32 


1859 


gi|20536244| 
ref|XP_0605
05.4| 


Homo sapiens 


similar to autoantigen La 


72 


30 


1860 


gi|17541362| 
ref|NP_5024
09.1| 


Caenorhabditis
elegans 


K08E7.5.p 


103 


29 


1860 


gi|17446900| 
ref|XP 0658 
33.1| 


Homo sapiens 


similar to DNA-directed RNA 
polymerase (EC 2.7.7.6) II largest 
chain - Mastigamoeba invertens 
(fragment) 


100 


34 


1860 


gi|9628166|r
ef|NP_0427
52.1|


African swine
fever virus


CD2 homolog


98


30


1861 


AAY70691 


Homo sapiens 


DAND Human membrane attractin-2. 


162 


40 


1861 


AAY70690 


Homo sapiens


DAND Human membrane attractin-1.


162 


40 


1861 


gi12275390


Rattus 
norvegicus 


membrane attractin 


162 


40 


1862 


gi10039425


Equus caballus 


ALR protein 


81 


28 


1862


cn"17^9Q*\91 


Mus musculus


similar to elastin microfibril interface
located protein 


CO 


39 


1862 


AAM40414 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
Mn 7^0 


79 


39 


1863 


gi|16588389| 
gb|AAL267 

0 / .l[/vroU*t 
442 1 


Homo sapiens 


B lymphocyte activation-related protein 
BC-1514 


247 


52 


1863


ref|XP_1137
90.1|


Homo sapiens 


similar to B lymphocyte activation- 
related protein BC-1514


11/ 


05 


1863 


gi|21301715| 
60.1 1 


Anopheles 
gambiae str. 
PEST 


agCP8366 


85 


41 




A ATM 


Homo sapiens 


HUMA- Human novel secreted protein, 
oeq id oU4. 


1275 


78 




A ATT1 A** 19 


: 

Homo sapiens 


HUMA- Human novel secreted protein, 
Seq ID 1265.


1 1 o*> 


76 


loot 




Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 6135. 




m 


1865 


AAB94953 


Homo sapiens 


HELI- Human protein sequence SEQ 

JUL/ liv.HrrOJ. 


86 


29 


1865 


gi3746787


Homo sapiens 


SYT interacting protein SIP 


86 


29 


1865


oil Sft99^ft7 
gll JUZZjU/ 


Homo sapiens


coactivator activator


OU 


90 


1866 


gi17133332


Nostoc sp. PCC
7120


preprotein translocase SecY subunit 


68 


43 


1866 


gi|13489110| 
ref|NP_0687
73.1|


Homo sapiens 


gap junction protein, alpha 3, 46kD 
(connexin 46) 


66 


40 


1867


tri706Q3n 
gl/UOj'OU 


Rattus
norvegicus


cyclic GMP stimulated
phosphodiesterase 


1 01 


o^ 


1867 


AAV54762_ 
aal 


Homo sapiens 


UNIW Human cGS-PDE cDNA DNA 
sequence.


137 


100 


1867 


AAV36157_
aal 


Homo sapiens 


UNIW Human cyclic-GMP-nucleotide 
phosphodiesterase cDNA. 


137 


100 


1868




Homo sapiens 


HELI- Human protein sequence SEQ
IDNO:18516. 


112 


27 


1868


A A V01/L47 


Homo sapiens 


HUMA- Human secreted protein
sequence encoded by gene 48 SEQ ID


1 II 


27 


1868 


AAY91393 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 48 SEQ ID 
NO: 114. 


112 


27 


1870 


AAU07886 


Homo sapiens 


WHED Polypeptide sequence for 
human hspG15. 


1454 


94 


1870 


gi13603891


Homo sapiens 


MOV10-like 1


1454 


94 


1870 


gi13603857


Mus musculus 


MOV10-like 1


954 


77 


1871 


AAM96652 


Homo sapiens 


HUMA- Human reproductive system
related antigen SEQ ID NO: 5310.


484


96


1871 


gi18676652


Homo sapiens 


FLJ00225 protein


433 


95 


1871




D ClUt- UAia 

tViiKptirji 

UUUCUwa 


liUILUlUOV > ^ 


70 


32 


1872 


AAQ90304_ 
aal 


Homo sapiens 


NISR Human thyroid peroxidase gene.


73 


29 


i 


A AWAR7R1 
AAW45/01 




RSRR- Thyroid peroxidase.


73 


29 


1872


A AR7S6R0 


Homo sapiens


NISR Human thyroid peroxidase.


73 


29 


1873 


AAG03774 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7855. 


228 


90 


1873 


gi338288 


Homo sapiens 


preprosomatostatin I 


228 


90 


1873


m7A070Q 


Macaca
fascicularis


preprosomatostatin 


228 


90 


1875


A AP3ft41ft 


Homo sapiens


DAND Nearly complete p107 protein.


76 


30 


1875 


gi347378


Homo sapiens 


p107


76 


30 


1875 


gll J/o/ i 


Drosophila
melanogaster




76 


24 


1876 


ABB17955 


Homo sapiens 


HUMA- Human nervous system related
polypeptide SEQ ID NO 6612.


186 


40 


1876 


AAS17764_ 
aal 


Homo sapiens 


GENA- Human Genomic DNA for 

v^Jx I O JJ L . 


167 


39 


1876 


AAO02331 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 16223. 


165 


42 


1877 


gi|59977|em 
b|CAA78oo 
2.1| 


Human
endogenous
retrovirus


mparuie rusion rranscnpi r \ui\£.i~> 


224 


76 


1878 


ABB84943 


Homo sapiens 


sequence SEQ ID NO:254. 


1056 


93 


1878 


A A Til 1 £-lf\ 


Homo sapiens 


PTJHT AtniriA apiH Qpnupnfp of a 
±xWJ A Milium al/iLl dviJu^Livi; \ji c* 

human protein having a hydrophobic 
domain. 


1056 


93 


1878 




Homo sapiens 


OFHTH PR HI nnlvnentide 


1056 


93 


1879


A DDI CQ<1 


Homo sapiens


HUMA- Human nervous system related
polypeptide SEQ ID NO 4518. 


73 


36 


1880


A ATT811 17 
AAUoji 1 / 


Homo sapiens


ZYMO Novel secreted protein
Z799543G2P. 


66 


54 


1880


gllZfZJLfiO 


Lactococcus
lactis subsp.
lactis

fnilw fYipmhrane linoDrotein precursor 


66 


26 


1881 


gi609624 


Vibrio cholerae 


EpsC 


73 


29 


1882


gliZ00/*K)0 


Rattus
norvegicus


synaptotagmin VIId


86 


32 


1882 


gi12667454


Rattus 
norvegicus 


synaptotagmin VIIc


85 


33 


1882


/«HAn77 


Pcpi ifmrnVitPQ 
rdCUUUiauica 

virus 


ORF-3 protein


83 


35 


1883 


gi1747


Oryctolagus 
cuniculus 


trichohyalin 


119 


29 


1883 


gi2072290 


Xenopus laevis 


XL-INCENP 


100 


27 


1883 


gi12584554


Human
coxsackievirus
B3


polyprotein


96 


25 


1884 


gi|15601413|
ref|NP_2330
44.1|


Vibrio cholerae


sucrose-6-phosphate dehydrogenase


65


55


1885 


gUOoYoZof 


Homo sapiens


Similar to C-terminal modulator protein


74 


35 


1885 


gi15866714


Homo sapiens 


C-terminal modulator protein


74 


35 


1885 


AAOOo9o4 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 20876.


70 


60 


1887 


AAW25939 


Homo sapiens 


oin Jvo l -ceil receptor v -uci« j . i 
peptide fragment 


601 


99 


1887


gi36973 


Homo sapiens 


T-cell receptor beta-chain


601 


99 


1887 


gi1552498


Homo sapiens 


V segment translation product


600


100 


1888 


gi18874468


Homo sapiens 


partitioning-defective 3-like protein
splice variant c


198 


73 


1888 


gi16903870


Homo sapiens 


partitioning-defective 3-like protein
splice variant b 


198 


73 


1888 


gi16903868


Homo sapiens 


partitioning-defective 3-like protein
splice variant a


198 


73 


1889 


gi21489377


Homo sapiens 


MAPA protein 


1620 


99 


1889 


gi21489330


Bos taurus


MAPA protein 


833 


56 


1889 


gi21489379 


Mus musculus 


MAPA protein 


630 


48 


1890 


AAY10874 


Homo sapiens 


HUMA- Amino acid sequence of a 
human secreted protein. 




100 


1890 


gi17429674


Ralstonia 
solanacearum 


PROBABLE LIPOPROTEIN 


73 


44 


1891 


gi15723141


Homo sapiens 


c349E10.1.1 (novel protein, isoform 1)


1 an 




1891 


AAB59006 


Homo sapiens 


HUMA- Breast and ovarian cancer 
associated antigen protein sequence 
SEQ ID 714.


174 


47 


1891 


gi19353342


Mus musculus 


RiKbN cujna yDiuuDoJtJUz gene 




*T f 


1892 


AAM86086 


Homo sapiens 


HUMA- Human
immune/haematopoietic antigen SEQ
ID NO: 13679.


95 


53 


1892 


AAO05973 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 19865.


94 


82 


1892 


AAO09418 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 23310. 


91 


70 


1893 


gi8778607 


Arabidopsis
thaliana


F5M15.23 


/ 1 




1894 


AAM65951 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26257. 


69 


38 


1894 


AAM53568 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 25673. 


69 


38 


1894 


gi|20832567| 
ref|XP_1335
24.1|


Mus musculus 


similar to Heterogeneous nuclear 
ribonucleoprotein A3 (hnRNP A3)
(D10S102)


163 


76 


1895 


AAM66299 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26605. 


440 


83 


1895 


AAM53913 


Homo sapiens


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 26018. 


440 


83 


1895 


gi|6723273|d
bj|BAA8965
9.1|


Baboon 
endogenous 
virus strain M7 


gag-pol precursor polyprotein 


270 


45


1896


gi4883988 


Bartonella 
clarridgeiae 


cell division protein FtsZ 


68 


28 


1897 


AAO13209 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 27101.


142


54


1897 


AAM66708 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27014. 


124 


46 


1897 


AAM54310 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 26415. 


124 


46 


1898 


gi2565268 


Drosophila 
virilis 


pore-forming protein MIP family 


75 


27 


1898 


gi7453547 


Homo sapiens 


glioma tumor suppressor candidate 
region protein 1 


75 


31 


1898 


gB218331 


Metarhizium 
anisopliae 


nitrogen response regulator 


74 


26 


1899 


gi9656609 


Vibrio cholerae 


chemotaxis protein CheA 


73 


32 


1899 


gi|20908537| 
ref|XP_1274 
14.1| 


Mus musculus 


RIKEN cDNA 1700001L19


443 


80 


1899 


gi|15642063| 
ref|NP 2316 
95.1| 


Vibrio cholerae 


chemotaxis protein CheA 


73 


32 


1900 


gi|18586105| 
ref|XP 0914 
00.1|


Homo sapiens 


similar to sca1


203 


84 


1900 


gi|20888279| 
ref|XP 1465 
08.1| 


Mus musculus 


similar to spinocerebellar ataxia type 1 


199 


82 


1901 


gi338033 


Homo sapiens 


serum protein 


90 


32 


1901 


gi4808221 


Homo sapiens 


dJl 17715,2 (serum constituent protein 
MSE55) 


90 


32 


1901 


gi4098993 


Mus musculus 


polyhomeotic 2 


88 


30 


1902 


AAB19933 


Homo sapiens 


INCY- Human oxidoreductase OXRD- 
8. 


250 


100 


1902 


gi19713043


Fusobacterium 
nucleatum subsp. 
nucleatum 
ATCC 25586 


Iron/zinc/copper-binding protein


73 


22 


1902 


gi|20342079| 
ref|XP_1106
14.1|


Mus musculus 


RIKEN cDNA 1700003E16 


77 


25 


1903 


gi342279 


Macaca 
nemestrina 


opiomelanocortin


231 


49 


1903 


gi28342 


Homo sapiens 


proopiomelanocortin 


230 


49 


1903 


gi190183


Homo sapiens 


opiomelanocortin


230 


49 


1904 


gi|11037117|
gb|AAG274
85.1|AF194
537_1


Homo sapiens 


NAOli 


loU 




1905 


gi5360984 


Homo sapiens 


dJ228H13.1 (similar to Ribosomal 
protein L21e) 


152 


72 


1905 


AAB44126 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO: 1571. 


150 


83


1905 


gi550015 


Homo sapiens 


ribosomal protein L21 


150 


83 


1906 


gi2654610 


Pseudomonas 
aeruginosa 


arginine/ornithine succinyltransferase
AI subunit


79


o c 


1906 


gi17226812


Botryotinia 
fuckeliana


histidine kinase 


72 


33


1906 


gi16904238


Botryotinia 
fuckeliana 


two-component osmosensing histidine 
kinase BOSlp 


72 


33 


1908 


gi330359 


Human 
herpesvirus 4 


nuclear antigen precursor 


91 


37


1908 


gi1632793


Human 
herpesvirus 4 


EBNA3C (EBNA 4B) latent protein


91 


37


1908 


gi1184677


Candida albicans 


hyphal wall protein 1 


90 


38


1909 


gi13177635


Rattus 
norvegicus 


phospholipase C beta-3 


72 


26


1909 


gi1150880


Mus musculus 


phospholipase C beta3 


71 


26 


1909 


gi17105044


Simian 
adenovirus 25 


10.1 kDa 


71 


31 


1910 


gi9857054 


Leishmania 
major 


possible CG7055 protein 


71 


47 


1910 


gi1617560


Leishmania 
major 


LCFACAS5;L5701.2 


67 


33 


1910 


gi|9857054|e 
mb|CAC040
11.1|


Leishmania 
major 


possible CG7055 protein 


71 


47 


1911 


AAY87278 


Homo sapiens 


INCY- Human signal peptide 
containing protein HSPP-55 SEQ ID
NO:55. 


501 


82 


1911 


AAB18912 


Homo sapiens 


GETH A novel polypeptide designated 
PRO1889.


501 


82 


1911 


AAU27659 


Homo sapiens 


ZYMO Human protein AFP513481.


416 


77 


1912 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein


434 


80 


1912 


gi|18676710| 
dbj|BAB850
07.1| 


Homo sapiens 


FLJ00254 protein 


270 


64 


1913 


gi5713196 


Caenorhabditis 
elegans


liprin-alpha homolog SYD-2 


479 


38 


1913 


gi930343 


Homo sapiens 


LAR-interacting protein 1b


467 


39 


1913 


gi930341 


Homo sapiens 


LAR-interacting protein 1a


467 


39 


1914 


gi6651021 


Mus musculus 


semaphorin cytoplasmic domain-
associated protein 3B 


274 


63 


1914 


gi6651019 


Mus musculus 


semaphorin cytoplasmic domain- 
associated protein 3A 


274 


63 


1914 


AAM25720 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO: 1235. 


266 


61 


1915 


gi902214 


Zea mays 


RNA polymerase beta' subunit-2


72 


24 


1915 


gi12482


Zea mays 


RNA polymerase beta-2 subunit (AA 

1-1 jZ f ) 


72 


24 


1915 


gi|11467184|
ref|NP 0430 
17.1| 


Zea mays 


RNA polymerase beta' subunit-2


72 


24 


1916 


gi1655432


Mus musculus 


plexin2 


1135 


58 


1916 


AAM93435 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 3070. 


1132 


57 


1916 


gi961515 


Xenopus laevis 


plexin 


1126 


54


1917 


gi15559064


Mus musculus 


SNAG1 


86 


38 


1917 


gi|20863586| 
ref|XP_1415
81.1|


Mus musculus 


similar to dJ551D2.5 (novel protein)


88 


30 


1917 


gi|18644890| 
ref|NP 5706 
14.1|


Mus musculus 


sorting nexin associated golgi protein 1 


86 


38 


1918 


gi19528383


Drosophila 
melanogaster 


RE04404p 


67 


32 


1919 


AAM77461 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 37767. 


189 


79 


1919 


AAM64684 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 36789. 


189 


79 


1919 


gi|17477135| 
ref|XP 0634 
15.1| 


Homo sapiens 


similar to embryonal stem cell specific
gene 1 


263 


75 


1920 


gi2623757 


Rattus 
norvegicus 


neurabin 


172 


97 


1920 


gi2827450 


Gallus gallus 


KS5 protein 


154 


88 


1920 


gi13991829


Xenopus laevis 


neurabin 


145 


83 


1923 


gi5532302 


Heterocapsa 
triquetra 


PSH CP47 apoprotein 


75 


29 


1923 


gi1881335


Bacillus subtilis 


SIMILAR TO YQFU, YXKD, YITB 
OF B. SUBTILIS.


68 


38 


1923 


gi|5532302|g
b|AAD4470
1.1|


Heterocapsa 
triquetra 


PSH CP47 apoprotein 


75 


29 


1924 


gi6855429 


Leishmania 
major 


possible mucin 1 precursor 


77 


33 


1924 


gi5832816 


Caenorhabditis 
elegans 


contains similarity to Pfam domain: 
PF01694 (Rhomboid family), 
Score=61.7, E-value=5.1e-15, N=1


74 


34 


1924 


AAB51976 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 48 SEQ ID 
NO: 108. 


72 


38 


1925 


AAB51635 


Homo sapiens 


ROSE/ Human secreted protein 
sequence encoded by gene 16 SEQ ID 
NO:75. 


205 


31 


1925 


AAB47128 


Homo sapiens 


INCY- CDIFF-6, Incyte ID No. 
2009435CD1. 


199 


34 


1925 


ABB55766 


Homo sapiens 


FECH/ Human polypeptide SEQ ID 
NO 138. 


197 


38 


1926 


AAG89279 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 399. 


330 


44 






Homo sapiens


otvuiN- n urn an aurxr protein sequence 
SEQ ID NO:7.




AA 


1926 


gi13182757


Homo sapiens 


HTPAP 


319 


44 


1927 


gi13177290


Ectocarpus 
siliculosus virus 


EsV-1-8 


69 


36 


1928 


gi18700171


Arabidopsis 
thaliana 


AT5g20480/F7C8J70 


86 


39 


1928 


gi915207 


Sus scrofa 


gastric mucin 


83 


29


1928 


gi532113 


Caenorhabditis 
elegans 


homeotic region most like 
HMPB_DROME : homeotic 
proboscipedia protein 


79 


27 


1929 


ABB12295 


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO:2665.




jy 


1929 


AAG04080 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 8161. 


7o 


1Q 
•55 


1929 


gi9279807 


Drosophila 
melanogaster 


cortactin 


77 


Z 1 


1930 


AAV81204_ 
aal 


Homo sapiens 


GEHO Human CD7 cDNA. 


of z. 


fj 


1930 


AAB36657 


Homo sapiens 


IMMV Human CD7 protein sequence 
SEQ ID NO:2. 


872 


73 


1930 


AAU02438 


Homo sapiens 


GEHO Human lymphocyte cell surface 
antigen CD7 polypeptide. 


872 


73 


1931 


gi2636248 


Bacillus subtilis 


similar to transaldolase (pentose 
phosphate) 


73 


29 


1931 


gi|21398633| 
ref|NP 6546 
18.1| 


Bacillus 

anthracis A2012 


Transaldolase, Transaldolase [Bacillus 


74 


29 


1931 


gi|16080764| 
ref|NP_3915
92.1| 


Bacillus subtilis 


similar to transaldolase (pentose 
phosphate) 


73 


29 


1932 


AAB43545 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO:990. 


73 


46 


1932 


AAM40234 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 3379. 


71 


26 


1934 


gi3129962 


Gallus gallus 


B locus Lectin like Natural Killer cell 
surface protein 


82 


30 


1934 


AAB93791 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:13545.


77 


38 


1934 


gi2541864 


Drosophila 
melanogaster 


DAD polypeptide 


77 


32 


1935 


gi|4959869|g
b|AAD3453
6.1|


Murine leukemia 
virus 


polymerase 


335 


52 


1935 


gi|6524624|g 
b|AAF15098
.1|


Phascolarctos 
cinereus 


pol protein 


331 


52 


1935 


gi|9630313|r
ef|NP 0567 
90.1| 


Gibbon ape 
leukemia virus 


pol polyprotein 


328 


52 


1936 


gi6562332 


Arabidopsis 
thaliana 


diaminopimelate decarboxylase


OK 

8o 




1936 


gi7573355 


Arabidopsis 
thaliana 


diaminopimelate decarboxylase-like
protein 


80 




1936


gll3JL40ZJl) 


Arabidopsis 
thaliana 


AT5g11880/F14F18_50


86 


30 


1939 


AAU07442 


Homo sapiens 


GETH Human Wntl Upregulated 
protein 2 (WUP2). 


300 


100 


1939 


AAU07441 


Homo sapiens 


GETH Human Wntl Upregulated 
protein 1 (WUP1). 


300 


100 


1939 


AAB56802 


Homo sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO:1380.


300 


100


1940 


gi5802814 


Homo sapiens 


Gag-Pro-Pol-Env protein 


JO/ 


D 1 


1940 


gi4185939 


Human 
endogenous 
retrovirus K 


pol protein 


586 


57 


1940 


gi5802821 


Homo sapiens 


Gag-Pro-Pol protein 


350 


j 1 


1941 


AAU83088 


Homo sapiens 


ZYMO Novel secreted protein 
Z2812G3P. 


DoO 


100


1941 


AAB20275 


Homo sapiens 


SCHE Human interleukin DNAX 80. 


535 


76 


1941 


AAB20277 


Homo sapiens 


SCHE Human interleukin DNAX 80
variant. 




fo 


1942 


AAM06866 


Homo sapiens 


HYSE- Human foetal protein, SEQ ID 
NO: 1074. 


994 


100


1942 


gi17426446


Homo sapiens 


bA35lK23.5 (novel protein) 


933 


54 


1942 


gi15099951


Mus musculus 


diacylglycerol acyltransferase 2 


915 


55


1943 


AAM06596 


Homo sapiens 


HYSE- Human foetal protein, SEQ ID 
NO: 327. 


406 


98 


1943 


gi|15640499| 
ref|NP_2301
26.1| 


Vibrio cholerae 


S-adenosylmethionine synthase


67 


51 


1945 


AAG75561 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:6325.


327 


100


1945 


gi16416764


Homo sapiens 


FKSG16 


327 


100 


1945 


gi13905212


Mus musculus 


RIKEN cDNA 1200006F02 gene 


261 


79 


1946 


gi288174


Mus musculus 


Oct2b 


97 


85 


1946 


gi53490 


Mus musculus 


Oct2.5 transcription factor 


97 


85 


1946 


gi9937478 


Drosophila 
melanogaster 


thyroid hormone receptor-associated 
protein TRAP 170 


72 


39


1947 


AAM66980 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27286. 


170 


69 


1947 


AAM54574 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26679. 


170 


69 


1947 


AAM75189 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 35495. 


159 


86 


1948 


AAY10874 


Homo sapiens 


HUMA- Amino acid sequence of a 
human secreted protein. 


100 


100 


1949 


AAA27155_ 
aal 


Homo sapiens 


GENE- Human P2 DNA. 


100 


100


1949 


AAY94475 


Homo sapiens 


GENE- Predicted translation product of 
human P2 splice isoform, P2-B. 


100 


100 


1949 


AAY94474 


Homo sapiens 


GENE- Human P2 protein. 


100


100


1950 


gi9502082 


Homo sapiens 


tubby super-family protein 


80 


40 


1950 


gi9502080 


Mus musculus 


tubby super-family protein 


77 


41 


1950 


gi8118432 


Oryza sativa 


beta-expansin 


73 


35 


1951


gi4808994


walleye 
epidermal 
hyperplasia virus 
type 1 


envelope polyprotein 


£Q 
\>y 




1951 


gi|15642893| 
ref|NP_2279
34.1| 


Thermotoga 
maritima 


ribonucleotide reductase, B12-
dependent


66 


46 


1952 


AAB80264 


Homo sapiens 


GETH Human PRO332 protein.


577 


61


1952 


AAB33425 


Homo sapiens 


GETH Human PRO332 protein
UNQ293 SEQ ID NO:57. 


577 


61 


1952 


AAY13396 


Homo sapiens 


GETH Amino acid sequence of protein 
PRO332.


577 


61


1953 


gil6648392 


Drosophila 
melanogaster 


LD39243p 


449 


61 


1953 


AAG73684 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:4448. 


371 


55 


1953 


AAY48312 


Homo sapiens 


META- Human prostate cancer- 
associated protein 9. 


371 


55 


1954 


AAU84348 


Homo sapiens 


BAAK/ Protein MMP2 differentially 
expressed in breast cancer tissue.


2068 


94 


1954 


ABB90738 


Homo sapiens 


UYJO Human Tumour Endothelial 
Marker polypeptide SEQ ID NO 208.


2068 


94


1954 


AAB84607 


Homo sapiens 


PFIZ Amino acid sequence of matrix 
metalloproteinase gelatinase A. 


2068 


94 


1955 


gil6769680 


Drosophila 
melanogaster 


LD46678p 


245 


35 


1955 


AAM66797 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27103. 


148 


80 


1955 


AAM54396 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26501. 


148 


80 


1957 


AAB80242 


Homo sapiens 


GETH Human PRO236 protein.


648 


97 


1957 


AAM93378 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 2955. 


648 


97 


1957 


AAB12157 


Homo sapiens 


PROT- Hydrophobic domain protein 
from clone HP03165 isolated from KB
cells. 


648 


97 


1958 


AAM41696 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6627. 


234 


47 


1958 


AAU17119 


Homo sapiens 


HUMA- Novel signal transduction 
pathway protein, Seq ID 684. 


229 


46 


1958 


gil6741621 


Homo sapiens 


Similar to RAB37, member of RAS 
oncogene family 


228 


47 


1959 


gil8025526 


cercopithecine
herpesvirus 15 


LF3 


140 


30 


1959 


gi3153821 


Mus musculus 


plenty-of-prolines-101; POP101; SH3- 
philo-protein 


137 


25 


1959 


gi39255 


Actinomyces 
viscosus 


sialidase 


129 


28 


1960 


ABB12366 


Homo sapiens 


HYSE- Human bone marrow expressed 
protein SEQ ID NO: 120. 


400 


90 


1960 


AAO12936


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 26828. 


115 


95 


1960 


AAM84898 


Homo sapiens 


HUMA- Human
immune/haematopoietic antigen SEQ
ID NO:12491.


1 


OA 


1961 


gil9110438 


Homo sapiens 


polycystin-1L1


190 


94 


1961 


gi3115393 


Rana pipiens


guanylate cyclase inhibitory protein 


80 


35 


1961 


gi3462887 


Rattus 
norvegicus 


alpha-fodrin 


68 


31 


1962 


AAU83130 


Homo sapiens 


ZYMO Novel secreted protein
Z835892G6T.


1076


100


1962 


gil 890354 


Brassica napus


L-ascorbate peroxidase 


OA 


n 
33 


1962 


gi7529611 


Leishmania 
major 


hypothetical protein L787.06


79 


31 


1963 


AAG78679 


Homo sapiens 


BODE- Human thrombotic protein 46. 


467 


86 


1963 


AAY87347 


Homo sapiens 


INCY- Human signal peptide 
containing protein HSPP-124 SEQ ID 
NO: 124. 


467 


86 


1963 


AAB01431 


Homo sapiens 


MILL- Human TANGO 224 (form 2). 


467 


86 


1964 


gi3413504 


Rattus 
norvegicus 


Bassoon 


81 


26 


1964 


gi330452 


human 
herpesvirus 5 


DNA polymerase 


79 


28 


1964 


AAV69717__ 
aal 


Homo sapiens 


LUDW- Tumour rejection antigen 
precursor MAGE-C1 cDNA. 


73 


33 


1965 


gi|2323287|g
b|AAB6652
8.1|


multiple 
sclerosis 
associated 
retrovirus 


polyprotein 


286 


64 


1965 


gi|2351212|d 
bj|BAA2206 
4.1|


Friend murine 
leukemia virus 


gag-pol polyprotein (precursor protein) 


179 


47 


1965 


gi|9629516|r 
ef|NP_0447
38.1| 


Rauscher murine 
leukemia virus 


Pol 


179 


47 


1966 


gi|2323287|g
b|AAB6652
8.1|


multiple 
sclerosis 
associated 
retrovirus 


polyprotein 


476 


65 


1966 


gi|2281588|g 

b|AAB6416 

0.1| 


synthetic 
construct 


Pol 


323 


51 


1966 


gi|9626961|r 
ef|NP_0579
33.1| 


Murine leukemia 
virus 


Pr180


323 


51 


1967 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


518 


73 


1967 


AAM65715 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26021. 


464 


69 


1967 


AAM53338 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 25443. 


464 


69 


1968 


AAG78149 


Homo sapiens 


BODE- Human polypeptide-
cytochrome b5-13.


388 


82 


1968 


gi3150438 


Human 
endogenous 
retrovirus K 


pol-env 


345 


55 


1968 


gi1469243


Human 
endogenous 
retrovirus K 


pol/env 


345 


55 


1969 


gi21113108


Xanthomonas 
campestris pv. 
campestris str. 
ATCC 33913 


TonB-dependent receptor 


78 


31


1969 


gi476274


Homo sapiens 


R kappa B 


n 




1969 


gi4206769 


Acanthamoeba 
castellanii 


myosin I heavy chain kinase 


76 


27 


1970 


gi|13310191| 
gb|AAK181 
89.1|AF331
500_1


multiple
sclerosis
associated
retrovirus
element


recombinant envelope protein 


244 


77 


1970 


gi|8272468|g 
b|AAF74215 
.1|AF15696 
3_1


Homo sapiens 


envelope protein 


219 


81 


1970 


gi|21103962|
gb|AAM331 
41.1| 


Homo sapiens 


enverin-2 


219 


77 


1971 


AAU83621 


Homo sapiens 


GETH Human PRO protein, Seq ID No 
60. 


320 


100 


1971 


AAO05826 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 19718. 


295 


93 


1971 


AAM39560 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2705. 


194 


56 


1972 


gi6456112


Mus musculus 


F-box protein FBX15 


128 


44 


1972 


gi21428946 


Drosophila 
melanogaster 


GH22104p 


74 


31 


1972 


gi|6456112|g
b|AAF09139
.1|


Mus musculus 


F-box protein FBX15


128 


44 


1973 


gi148270


Escherichia coli


lambda-integrase 


550


94 


1973 


gil790244 


Escherichia coli 
K12 


site-specific recombinase, acts on cer 
sequence of ColE1, effects
chromosome segregation at cell 
division 


550 


94 


1973 


gil3364217 


Escherichia coli 
O157:H7


site-specific recombinase XerC 


544 


ni 


1974 


gil805552 


Escherichia coli 


FORMATE HYDROGENLYASE 
TRANSCRIPTIONAL ACTIVATOR. 


887 


88


1974 


gi1616960


Escherichia coli 


HyfR 


887 


88 


1974 


gi7920396 


Salmonella 
typhimurium 


formate hydrogenlyase activator 
protein 


522 


54 


1975 


gi409795


Escherichia coli 


No definition line found 


1175 


99 


1975 


gi15074592


Sinorhizobium 
meliloti 


HYPOTHETICAL 
TRANSMEMBRANE PROTEIN 


378 


33 


1975 


gi17740718


Agrobacterium 
tumefaciens str. 
C58 (U.
Washington) 


Na+/Pi-cotransporter 


372 


34 


1976 




Homo sapiens 


IfZATf - WnmnTi rria<;t surface 

antigen. 


163 


23 


1976 


gi12654783


Homo sapiens 


Similar to loss of heterozygosity, 11, 
chromosomal region 2, gene A


163 


23 


1976 


AAZ45690_ 
aal 


Homo sapiens 


REGC cDNA sequence encoding the 
human minor vault protein p193.


108 


25 


1977 


ABB56523 


Homo sapiens 


MERI Human NMDA receptor subunit 
SEQ ID NO 44. 


73 


28


1977 


AAW87504 


Homo sapiens 


SD3I- Human N-methyl-D-aspartate 
receptor subunit encoded by clone 
NMDA24. 1 


Id 




1978 


AAG00471 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 4552. 


285 


93 


1978 


gi298489 


Papio hamadryas 


SP-10 




34 


1978 


gi452582 


Vulpes vulpes 


fox sperm acrosomal protein FS A- 
Acr.l 


132 


34 


1979 


AAB87128 


Homo sapiens 


MILL- Human secreted protein 
MANGO 349, SEQ ID NO: 130.


4VU 


or: 
OO 


1979 


AAB87179 


Homo sapiens 


MILL- Human secreted protein 
MANGO 349 I21K variant, SEQ ID
NO:227. 


488 


85 


1979 


AAB87181 


Homo sapiens 


MILL- Human secreted protein 
MANGO 349 E41D variant, SEQ ID
NO:231. 


487 


85 


1982 


AAM75035 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 35341.


109 


67 


1982 


AAM62231 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 34336. 


109 


67 


1982 


gil 1967423 


Mus musculus 


vomeronasal receptor V1RC5 


105 


76 


1983 


AAG89276 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 396. 


224 


46 


1983 


AAB56565 


Homo sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO:1143.


99 


40 


1983 


AAY44987 


Homo sapiens 


INCY- Human epidermal protein-4. 


78 


28 


1984 


AAB95089 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:17025.


498 


97 


1984 


AAM06608 


Homo sapiens 


HYSE- Human foetal protein, SEQ ID 
NO: 339. 


495 


96 


1984 


gi497890 


unidentified 
nitrogen-fixing
bacteria 


alpha subunit of dinitrogenase 
reductase (Fe protein) 


73 


24 


1985 


gi|17455728| 
ref|XP_0635
94.1|


Homo sapiens 


similar to Zinc-finger protein ubi-d4
(Requiem) (Apoptosis response zinc
finger protein)


/l 


3 / 


1986 


gi21428886 


Drosophila 
melanogaster 


GH12469p 


oy 


1A 
34 


1987 


gi7767529 


Bos taurus 


cyclophilin I 


364 


75 


1987 


gi8699209 


Canis familiaris 


cyclophilin A 


361 


88


1987 


gill641132 


Sus scrofa


cyclophilin


361 


88 


1988 


gil5073168 


Sinorhizobium 
meliloti 


PROBABLE TRANSLATION 
INITIATION FACTOR IF-2 
PROTEIN 


81


3/ 


1988 


ion 

gill81352 


Paramecium 
bursaria
Chlorella virus 1


xTO-n.cn protein, riro {oa.) 


78 


25 


1988 


gi493242 


Feline
herpesvirus 1


Feline herpesvirus type 1 immediate
early protein 


77 


20 


1989 


AAM65707 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26013.


134 


66


1989 


AAM53330 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 25435. 


134 


66 


1989 


gi|20475216|
ref|XP_1148
02.1| 


Homo sapiens 


similar to synapsin I 


228 


59 


1990 


AAM71181 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ
ID NO: 31487. 


110 


64 


1990 


AAM58674 


Homo sapiens 


MOLE- Human brain expressed single
exon probe encoded protein SEQ ID
NO: 30779.


110 


64 


1990 


gi21323636 


Corynebacterium 
glutamicum 
ATCC 13032 


Sulfate permease and related 
transporters (MFS superfamily) 


75 


26 


1991 


gil932813 


Xenopus laevis 


dsRNA adenosine deaminase 


96 


34 


1991 


AAE10203 


Homo sapiens 


HYSE- Human bone marrow derived 
contig protein, SEQ ID NO: 68.


83 


25 


1991 


gi3242649


Rana catesbeiana 


alpha 1 type I collagen 


80 


30 


1992 


gil 181423 


Paramecium 
bursaria 

Chlorella virus 1 


PBCV-1 chitinase 


71 


41 


1992 


gi|21300897| 

gb|EAA130 

42.1| 


Anopheles 
gambiae str. 
PEST 


agCP 14405 


72 


37 


1992 


gi|9631828|r 
ef|NP_0486
13.1| 


Paramecium 
bursaria 

Chlorella virus 1 


PBCV-1 chitinase 


71 


41 


1994 


gi8248755 


Plasmodium 
falciparum 3D7 


protein phosphatase


72 


25 


1994 


gi4104348 


Campylobacter 
rectus 


S-layer-RTX protein 


70 


38 


1994 


gi|8248755|e 
mb|CAB628 
78.2| 


Plasmodium 
falciparum 3D7 


protein phosphatase


72 


25 


1995 


gi21324402 


Corynebacterium 
glutamicum 
ATCC 13032 


Uncharacterized ATPase related to the 
helicase subunit of the Holliday
junction resolvase 


73 


38 


1995 


gi|19552845| 
ref|NP_6008 
47.1| 


Corynebacterium 
glutamicum 


COG2256:Uncharacterized ATPase 
related to the helicase subunit of the
Holliday junction resolvase


73 


38 


1995 


gi|17533213| 
ref|NP_4957
77.1| 


Caenorhabditis 
elegans 


F14E5.5.p 


73 


30 


1996 


gil871223 


Rickettsia typhi 


crystalline surface layer protein 


92 


30 


1996 


gi6969926 


Rickettsia 

n p»cc1i1i'mJinn'i{ 
d-CoUiiitii jq nun 


OmpB 


79 


25 


1996 


gil4670347 


Rickettsia felis 


OmpB 


78 


25 


1997 


gi|20548733| 
refJXP 0556 
4U| 


Homo sapiens 


similar to gag protein 


256 


58 


1997 


gi|9739120|g 
b|AAF97916 
.1|


Bovine leukemia 
virus 


gag 


186 


34


1997 


gi|9626226|r 
ef|NP_0568
97.1|


Bovine leukemia 
virus 


Pr44 


185 


34 


1998 


AAM79834 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3480. 


279 


71 


1998 


AAM78850 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1512. 


279 


71 


1998 


AAM79204 


Homo sapiens 


HYSE- Human protein SEQ ID NO
1866. 


272 


71


1999 


AAM73176 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 33482. 


168 


48 


1999 


AAM60521 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 32626. 


1 /TO 

loo 




1999 


gi|13929148|
ref|NP_1139
97.1| 


Rattus 
norvegicus 


cyclic nucleotide-gated channel beta 
subunit 1 


163


47 


2000 


gil869859 


human 
herpesvirus 2 


very large tegument protein 


73 


30 


2000 


gi7380253 


Neisseria 

meningitidis 

Z2491 


2-keto-4-hydroxyglutarate aldolase 


70 


37 


2000 


gi7226633 


Neisseria 

meningitidis 

MC58 


4-hydroxy-2-oxoglutarate aldolase/2-
dehydro-3-deoxyphosphogluconate
aldolase


70 


37


2001 


gil7016969 


Mus musculus


NUANCE 


1 iff 


i& 


2001 


gi6273778 


Homo sapiens 


trabeculin-alpha


\1H 

I J I 


n 

JO i 


2001 


gil675222 


Mus musculus


ACF7 neural isoform 1 


136 


42 


2002 


AAM39256 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2401. 


81 


29 


2002 


gi840789


Homo sapiens 


binding regulatory factor 


81 


29 


2002 


gil7028337 


Homo sapiens 


regulatory factor X, 5 (influences HLA 
class II expression) 


81




2003 


gi2252814 


Mus musculus


FOG 


1 /2 


04 


2003 


AAR58815 


Homo sapiens 


USSH Human c-myc far upstream 
element (FUSE) binding protein 
(FBP) variant from HL60 clone 3-1.


lui 




2003 


gi3598974 


Rattus 
norvegicus 


protein tyrosine phosphatase TD14 


103 


*>/C 

20 


2004 


gil 1994696 


Arabidopsis 
thaliana 


contains similarity to DNA repair 
protein~gene id:K7M2.11 


77 


Zo 


2004 


gi7209527 


Mus musculus 


testis-specific gene 


73 




2004 


gi|17451912| 
ref|XP_0710
83.1| 


Homo sapiens 


similar to DNA-binding protein B 


234 


97 


2005


A A CI inn 


Homo sapiens


INCY- Human G-protein coupled 
receptor, GCREC-2. 


173 


100 


2005 


AAG65832 


Homo sapiens 


FARB Human G protein-coupled 
receptor (GPCR). 


173 


100 


2005 


AAG68126 


Homo sapiens 


FARB Human 7TM-GPCR protein 
sequence SEQ ID NO:6. 


105 


78 


2006 


gi20068811 


Homo sapiens 


Rab-coupling protein 


130 


43 


2006 


gil5822596 


Homo sapiens 


nRipll 


104 


45


2006 


gil3377897 


Homo sapiens 


Rab11 interacting protein Rip11a






2007 


gi|17539708| 
ref|NP_5014
89.1| 


Caenorhabditis 
elegans 


F08B4.5.p 


78 


42 


2008 


AAE10350


Homo sapiens 


PFIZ Human ADAMTS-J1.4 variant
protein. 


504


y t 


2008 


AAE10349 


Homo sapiens 


PFIZ Human ADAMTS-J1.3 variant 
protein. 


504


yf 


2008 


AAE10347 


Homo sapiens 


PFIZ Human ADAMTS-J1.1 variant
protein. 


504 


y / 


2009 


AAV31720_ 
aal 


Homo sapiens 


MOUN Nucleotide sequence of the
PUR-alpha gene. 


0/ 




2009 


AAT99264__ 
aal 


Homo sapiens 


MOUN Human PUR-alpha gene. 


C7 


Ly 


2009 


AAQ44800_ 
aal 


Homo sapiens 


MOUN Encodes single-stranded DNA 
binding (PUR) protein. 


87 


29 


2010 


gil70444 


Lycopersicon 
esculentum 


extensin (class E) 


123 


27 


2010 


gi4662641 


Arabidopsis 
thaliana


expressed protein 


116 


30 


2010 


gil88864 


Homo sapiens 


mucin 


115 


28 


2011 


AAY93650 


Homo sapiens 


HUMA- Amino acid sequence of a 
human prostacyclin-stimulating factor-
2.


1677 


100 


2011 


AAS15723_ 
aal 


Homo sapiens 


CURA- DNA encoding insulin-like 
growth factor family related protein, 
NOV3. 


1673 


99 


2011 


AAE17599 


Homo sapiens 


INCY- Human extracellular messenger 
(XMES)-l protein. 


1673 


99


2012 


gil0440434 


Homo sapiens 


FLJ00052 protein 


336 


oy 


2012 


gi20502870 


Mus musculus 


SDS3 


333 


68 


2012 


gi21430678 


Drosophila 
melanogaster 


RE74901p 


170


36 


2013 


AAH77293_ 
aal 


Homo sapiens 


MILL- Human ion channel protein 
IC32391 cDNA coding region. 


214 


93 


2013 


AAE13278 


Homo sapiens 


INCY- Human transporters and ion
channels (TRICH)-5.


214 


93 


2013 


AAG77969 


Homo sapiens 


MILL- Human ion channel protein 
IC32391. 


214 


93 


2014 


gi4894768 


Xenopus laevis 


ephrin-B2 precursor 


78 


in 
ou 


2015 


AAU77498 


Homo sapiens 


INCY- Human lipid metabolism 
enzyme, LMM-6. 


1291 


100 


2015 


ABB08205 


Homo sapiens 


INCY- Human lipid metabolism 
enzyme-5 (LME-5). 


1122 


100


2015 


ABB07493 


Homo sapiens 


INCY- Human lipid metabolism 
molecule (LMM) polypeptide (ID: 


864 


75 


2016 


gi|14769015| 
ref|XP_0415
69.1| 


Homo sapiens 


fibrillin3 


68 


36 


2017 


gi2313786


Helicobacter 
pylori 26695 


chorismate synthase (aroC) 


78 


33 


2017 


gi4155160 


Helicobacter 
pylori J99 


CHORISMATE SYNTHASE 


72 


32


2017 


gi|15645287| 
ref|NP_2074
57.1|


Helicobacter 
pylori 26695 


chorismate synthase (aroC) 


to 


DO ! 


2018 


gil5485622 


Homo sapiens 


Q9H4T4 like 


J uoo 


ion 


2018 


ABB14744


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 3401. 


694 


98 


2018 


AAB95100 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 17064. 


i n i 
1U1 




2019 


K i8050556 


Gorilla gorilla 


carboxyl-ester lipase


223 


42


2019 


AAU09894 


Homo sapiens 


MONS Bile Salt Stimulated Lipase 
(BSSL). 


217 


39 


2019 


ABB04676 


Homo sapiens 


MONS Human milk bile salt- 
stimulated lipase (BSSL) protein SEQ 
ID NO:2.


217 


39


2020 


gi2065210 


Mus musculus


Pro-Pol-dUTPase polyprotein 


515 


74


2020 


gi|385615|gb 
|AAB26708. 
1|


Mus sp. 


fibulin gene homolog 


300 


75 


2020 


gi|13194728| 
gb|AAK155 
26.1|AF329 
451 1 


Gallus gallus


pol-like protein ENS-3 


170 


33 


2021 


AAM66980 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27286. 


170 


75


2021 


AAM54574 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26679. 


170 


75 


2021 


AAM75189 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 35495. 


159 


86 


2022 


AAD29146_ 
aal 


Homo sapiens 


ZYMO Human Zcyto21 consensus 
cDNA. 


649 


83 


2022 


AAU83208 


Homo sapiens 


ZYMO Novel secreted protein 
Z908463G2P. 


649 


83 


2022 


AAE18311 


Homo sapiens 


ZYMO Human Zcyto21 consensus 
protein.


649 


83 


2024 


gil4336750 


Homo sapiens 


Ce protein similar to Dm Cys3His 
finger protein 


84


34 


2024 


AAB50363 


Homo sapiens 


UYSL- Human SRCAP. 


83 


34 


2024 


AAB95541 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 18149.


83 


34 


2025 


gil8676682 


Homo sapiens 


FLJ00240 protein 


470 


45 


2025 


gil4701866 


Dictyostelium 
discoideum 


carmil 


III 


70 

Ay 


2025 


gil881738 


Acanthamoeba 
castellanii


myosin-I binding protein Acan125


219 


29 


2026 


ABB12490


Homo sapiens 


HYSE- Human bone marrow expressed 
protein SEQ ID NO: 329. 


212 


78 


2027 


AAU83147 


Homo sapiens 


ZYMO Novel secreted protein 
Z846363G2P. 


1153 


100 


2027 


gi|21287755|
gb|EAA000
76.1|


Anopheles 
gambiae str. 
PEST 


ebiP4780 


205 


51


2027 


gi|17552028| 
ref|NP_4984
07.1| 


Caenorhabditis 
elegans 


C05D11.8.p 


91 


38 


2028 


gil510143 


Homo sapiens 


similar to C. elegans protein encoded in 
cosmid T20D3 (Z68220).




57 


2028 


gi3879942 


Caenorhabditis 
elegans 


T20D3.11 




97 


2028 


gi5869818 


Globodera 
pallida 


NADH-ubiquinone oxidoreductase 
subunit 6 


82 


27 


2029 


AAE13288 


Homo sapiens 


INCY- Human transporters and ion 
channels (TRICH)-15.


75 


31 


2029 


gB252893 


Thermotoga 
neapolitana 


ABC transporter 


74 


37 


2029 


gi|18403965| 
ref|NP_5658
26.1| 


Arabidopsis 
thaliana


expressed protein 


f\j 




2030 


AAB97908 


Homo sapiens 


SHAN- Human GTP-binding protein
17 SEQ ID NO:2.




97 
LI 


2030 


AAM42129 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 7060. 


79 


27 


2030 


gi9971156 


Mus musculus 


GTP-binding like protein 2


79 


27 


2031 


gi|20864803| 
ref|XP_1308
00.1| 


Mus musculus 


RIKEN cDNA 4930503K02 


89 


25 


2031 


gi|21262152| 
emb|CAD32 
690.1|


Oryza sativa 


SMC4 protein 


77 


28 


2031 


gi|1507705|g
b|AAB0656
8.1|


Borrelia 
burgdorferi 


outer surface protein 


74 


33 


2032 


AAG65898 


Homo sapiens 


SMUC Amino acid sequence of GSK 
gene Id 18525. 


481 


100 


2032 


AAU83670 


Homo sapiens 


GETH Human PRO protein, Seq ID No 
158. 


471 


97 


2032 


ABB84896 


Homo sapiens 


GETH Human PRO1309 protein 
sequence SEQ ID NO: 160.


471 


97 


2034 


gi6723273 


Baboon 
endogenous 
virus strain M7 


gag-pol precursor polyprotein 


687 


43 


2034 


gil8448744 


Moloney murine 
leukemia virus 


Pr180 gag-pro-pol polyprotein


685 


42 


2034 


gi2801471 


Moloney murine 
leukemia virus 


Pr180


682 




2035 


gi|17554696| 
ref|NP_4976
70.1| 


Caenorhabditis 
elegans 


R148.7.p 


68 


32 


2035 


gi|16127996|
ref|NP_4145
43.1| 


Escherichia coli 
K12 


aspartokinase I, homoserine
dehydrogenase I


68 


43 


2035 


gi|19548975|
gb|AAL908
85.1|AF487
900_1


Escherichia coli 


aspartokinase I-homoserine 
dehydrogenase I 


68 


43 


2036 


gi13424459


Caulobacter
crescentus CB15


methyl-accepting chemotaxis protein
Mcpl


72


32


2036 


gi|16877133| 
gb|AAH168
38.1|AAH16 
838 


Homo sapiens 


carboxypeptidase, vitellogenic-like 


69 


30 


2037 


AAB67055 


Homo sapiens 


INCY- Human immune response 
molecule (IMUN) protein SEQ ID NO:
9. 




75 


2037 


AAO01862 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 15754. 




u / 


2037 


gi|6753924|r 
ef|NP_0343
74.1| 


Mus musculus


Friend virus susceptibility 1 


240 


39 


2039 


AAB38447 


Homo sapiens 


HUMA- Fragment of human secreted 
protein encoded by gene 20 clone 
HUFBY15. 


SO 


27 


2039 


gill 527799 


Mus musculus


GTP-binding protein like 1


73 


30 


2039 


gi695237 


Equine 
herpesvirus 2 


tegument protein 


ID 




2040 


gi|20544038| 
ref|XP_0896
12.4|


Homo sapiens 


similar to PER-HEXAMER REPEAT
PROTEIN 5 


68 


41 


2042 


AAM77922 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ
ID NO: 38228. 


642 


85 


2042 


AAM65219 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 37324. 




OJ 


2042 


gi|6723273|d 
bj|BAA8965 
9.1|


Baboon 
endogenous 
virus strain M7 


gag-pol precursor polyprotein 


139 


26 


2043 


gi48507 


Wolinella 
succinogenes 


formate dehydrogenase 


80 


27 


2043 


Kil2381857 


Danio rerio 


c-Maf 


78 


42 


2043 


gi|18594822| 
ref|XP_0929
95.1| 


Homo sapiens 


zinc finger protein 21 (KOX 14)






2044 


gi3132272 


Sus scrofa 


WT1 homologue 


99 


47 


2044 


AAG78446 


Homo sapiens 


MASI Predicted WT1 Wilm's tumour 
polypeptide of humans. 


96 


45 


2044 


AAG62154 


Homo sapiens 


CORI- Human W 1 l/roA tusion 
protem SbQ 1JJ JNU. i. 


yo 


45


2046 


gi21483222 


Drosophila 
melanogaster 


AT16994p 


86 


33 


2046 


gi21111736 


Xanthomonas 
campestris pv.
campestris str.
ATCC 33913


cell division protein 


79 


30 


2046 


gil2653493 


Homo sapiens 


Similar to brain acid-soluble protein 1 


79 


36 


2047 


ABB12490 


Homo sapiens 


HYSE- Human bone marrow expressed 
protein SEQ ID NO: 329. 


200 


83 


2047 


gi|20837783| 
ref|XP_1459
21.1| 


Mus musculus


similar to 40S ribosomal protein S11


73 


35


2047 


gi|6002932|g 
b|AAF00209 
.1|AF16496 
0 5 


Streptomyces 
fradiae 


glycosyl transferase 


71 


35 


2048 


AAB59012 


Homo sapiens 


HUMA- Breast and ovarian cancer 
associated antigen protein sequence 
SEQ ID 720. 


103 


32 


2048 


gi2429362 


Santalum album 


proline rich protein 


99 


31 


2048 


gil7945382 


Drosophila 
melanogaster 


RE17165p 


98 


25 


2051 


gi15625542


Hepatitis B virus 


S antigen 


71 


31 


2051 


gi|4884886|g 
b|AAD3185 
7.1|AF1341 
40 1 


Hepatitis B virus 


surface antigen 


68 


30 


2052 


AAB28764 


Homo sapiens 


HUMA- Sequence homologous to 
protein fragment encoded by gene 21.


693 


78 


2052 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


693 


78 


2052 


AAB73606 


Homo sapiens 


SHAN- Human dUTP pyrophosphatase 
26. 


668 


77 


2053 


gi9945983 


Pseudomonas 
aeruginosa 


transcriptional regulator PcaQ 


83 


34 


2053 


gil 3874427 


Homo sapiens 


cerebral protein-5 


76 


35 


2053 


gi12803205


Homo sapiens 


CAAX box 1


76 


35 


2054 


gi21307831 


Aplysia 
californica


CREB-binding protein 


76 


26 


2054 


gil6755887 


Drosophila 
melanogaster 


guanine nucleotide exchange factor 


76 


26 


2054 


gi|21307831| 

gb|AAL548 

59.1| 


Aplysia 
californica


CREB-binding protein 


76 


26 


2055 


gil6588389 


Homo sapiens 


B lymphocyte activation-related protein 
BC-1514 


437 


71 


2055 


AAB92981 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:11698.


407 


68 


2055 


AAM48325 


Homo sapiens 


SHAN- Human purine receptor 21.23. 


398 


74


2056 


gi|2072969|g
b|AAC5127
4.1|


Homo sapiens 


p40 


134 


47 


2056 


gi|7959889|g 
b|AAF71115 
.1|AF11672 
1 95 


Homo sapiens 


PRO2221


123 


43 


2056 


gi|2072974|g 

b|AAC5127 

7.1| 


Homo sapiens 


p40 


122 


44 


2057 


gil9l/l 178 


Homo sapiens 


metalloprotease disintegrin 16 with
thrombospondin type I motif 


CIO 




2057 


gil9171150 


Homo sapiens 


ADAMTS18 protein 


168 


35 


2057 


AAM39212 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2357. 


128 


76 


2058 


gi|4959869|g 
b|AAD3453


Murine leukemia 
virus 


polymerase 


336 


50


2058 


gi|9630313|r 
ef|NP_0567
90.1| 


Gibbon ape 
leukemia virus 


pol polyprotein 


331 


46 


2058 


gi|6723273|d 
bj|BAA8965 
9.1|


Baboon 
endogenous 
virus strain M7 


gag-pol precursor polyprotein 


329 


49 


2059 


gi|20546404|
ref|XP_1164
66.1|


Homo sapiens 


similar to nuclear receptor coactivator 
4; RET-activating gene ELE1 


179 


91


2060 


gi|6731237|g 
b|AAF27177 
.1|AF18231 
7 1 


Homo sapiens 


myoferlin 


112 


79


2060 


gi|798799|gb 
|AAC37713.
1|


Mus musculus


immunoglobulin heavy chain


79 




2060 


gi|20819487|
ref|XP_1453
57.1| 


Mus musculus 


similar to LYRIC 


79 


77 


2061 


gi415738 


Euglena gracilis 


PSII D1-polypeptide




97 


2061 


Rill491 


Euglena gracilis 


32 kd protein 


75 


27 


2061 


gill488 


Euglena gracilis 


32-Kda thylakoid membrane protein 




97 
J. 1 


2062 


gi21360549 


Arabidopsis 
thaliana


AT3g01480/F4P13_3 


79 


29 


2062 


gi3337366 


Arabidopsis 
thaliana 


nodulin-like protein 


68 


36 


2063 


gi7959778 


Homo sapiens 


PRO1546


121 


42 


2063 


AAG02639 


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 6720. 






2063 


AAG02753 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6834. 


1 1 A 

1 1U 




2064 


gil 5077406 


Antheraea 
yamamai 


fibroin 


i no 


in 


2064 


AAB82806 


Homo sapiens 


BOST- Human low density lipoprotein 
binding protein 2 (LBP-2). 


Q9 


94 


2064 


AAO01059 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 14951. 


QO 


10 

JU 


2065 


gi200964 


Mus musculus 


serine 2 ultra high sulfur protein 


Ou 


10 


2065 


gi200962 


Mus musculus 


serine 1 ultra high sulfur protein 


ou 


10 


2065 


AAM99918 


Homo sapiens 


HUMA- Human polypeptide SEQ ID
NO 34. 


7<? 


9R 


2066 


gi544724 


Cavia 


cholecystokinin A receptor; CCK-A
receptor 




90 


2066 


gi2541920 


Rattus 
norvegicus 


cholecystokinin type-A receptor


\>y 


29 


2066 


gi2114152 


Mus musculus 


cholecystokinin type-A receptor


69 


29 




CT19R9RSR6 


Pongo pygmaeus


BRCA1 


73 


22 


2068 


AAM40813 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5744. 


75 


29 


2068 


AAM39027 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2172. 


75 


29 


2068 


AAY25768 


Homo sapiens 


HUMA- Human secreted protein 
encoded from gene 58. 


75 


29 


2070 


giB34150 


Mus musculus 


unidentified reading frame (first ATG
at pos. 210)


169


28


2070 


gi557822 


Saccharomyces 
cerevisiae 


mal5, sta1, len: 1367, CAI: 0.3,

AMYH_YEAST P08640

GLUCOAMYLASE S1 (EC 3.2.1.3)


133 


1\J 


2070 


gi1304387


Saccharomyces 
cerevisiae var. 
diastaticus 


glucoamylase


133 


20 


2071 


gi17983056


Brucella 
melitensis 


BETA-HEXOSAMINIDASE A 


88 


29 


2071 


gi1573917


Haemophilus 
influenzae Rd 


; 

multidrug resistance protein A (emrA) 


81 


n 

33 


2071 


gi17982813


Brucella 
melitensis 


NITROGEN REGULATION
PROTEIN NTRB


QA 
5U 


zo 


2073 


gi|17532255| 
ref|NP_4964 
31.1| 


Caenorhabditis 
elegans 


ankyrin and proline rich domains 


0/ 


zy 


2074 


gi19919730


Homo sapiens 


BTEB5 


704 


97 


2074 


gi13195441


Homo sapiens 


BTE-binding protein 4 


A1Q 

47o 


04 


2074 


fi ii4549656 


Mus musculus 


dopamine receptor regulating factor 


452 


76 


2076 


AAE17482 


Homo sapiens 


ZYMO Human leucine-rich repeat-7 
(ZLRR7) protein. 


1326 


100 


2076 


AAU83190 


Homo sapiens 


ZYMO Novel secreted protein 
Z887300G2P. 


1326 


100 


2076 


ABB11242 


Homo sapiens 


HYSE- Human SLIT-2 homologue, 
SEQ ID NO: 1612. 


568 


99 


2077 


gi18893729


Pyrococcus 
furiosus DSM 
3638 


protease iv 


74 


34 


2077 


AAB94745 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 15792. 


71 


34 


2077 


gi16413096


Listeria innocua 


lin0656


68 


35 


2078 


gi60675 


Beet ringspot 
virus 


polyprotein 


75 


37 


2078 


gi|14743288|
ref|XP_0471
91.1|


Homo sapiens 


similar to Alu subfamily J sequence
contamination warning entry 


92 


58 


2078 


gi|20260801|
ref|NP_6201
13.1|


Beet ringspot 
virus 


polyprotein 


75 


37 


2079 


gi3834629 


Mus musculus 


diaphanous-related formin; p134
mDia2 


ZUo 


0/ 


2079 


AAG74400 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:5164.


*71 


ia 


2079 


gi3171906 


Homo sapiens 


DIA-156 protein


71 
/ 1 




2080 


gi17298315


Homo sapiens 


candidate tumor suppressor protein 


125 


100 


2080 


gi7861733 


Homo sapiens 


low density lipoprotein receptor related 
protein-deleted in tumor


Mo 


inn 


2080 


gi8926243 


Mus musculus 


low density lipoprotein receptor related 
protein LRP1B/LRP-DIT 


90 


63 


2081 


gi4574224 


Fundulus 
heteroclitus 


multidrug resistance transporter 
homolog 


343 


55 


2081 


gi16304396


Pseudopleuronec 
tes americanus 


multidrug resistance transporter-like 
protein 


340 


52 


2081 


gi3355757


Gallus gallus


ABC transporter protein 


328 


53 
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Score 


% 
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2082 


gi7532975 


bacteriophage 
phi-8 


P10 


67 


27 
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NO: 
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entry ID 


Description 


♦Results 


1059 


BL00349 


CTF/NF-I proteins. 


BL00349H 15.70 9.710e-09 8-45


1061 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.143e-10 29-61 
DM00215 19.43 8.322e-09 40-72


1062 


DM01354 


kw TRANSCRIPTASE REVERSE II
ORF2. 


DM01354U 12.24 6.092e-12 80-99 


1063 


PR00944 


COPPER ION BINDING PROTEIN
SIGNATURE


PR00944E 9.18 7.132e-09 33-46


1076 


PD00078


REPEAT PROTEIN ANK
NUCLEAR ANKYR.


PD00078B 13.14 9.217e-09 23-35


1089 


PR00308


TYPE I ANTIFREEZE PROTEIN
SIGNATURE


PR00308C 3.83 8.754e-10 16-25


1089 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE


PR00456E 3.06 9.658e-09 16-30


1089 


PR00341 


PRION PROTEIN SIGNATURE 


PR00341E 3.32 9.898e-09 24-43 


1099 


PR00886 


HIGH MOBILITY GROUP 
(HMG1/HMG2) PROTEIN 
SIGNATURE 


PR00886C 11.84 1.141e-12 28-46


1107 


PR00833


POLLEN ALLERGEN POA PI
SIGNATURE


PR00833H 2.30 3.077e-09 51-65


1118 


BL00472 


Small cytokines 
(intercrine/chemokine) C-C
subfamily signatur. 


BL00472A 7.45 5.655e-09 1-12


1118 


PR00655 


AUXIN BINDING PROTEIN 
SIGNATURE 


PR00655E 8.06 9.000e-09 88-103 


1119 


BL00970 


Nuclear transition protein 2 proteins. 


BL00970C 14.80 8.183e-12 99-136 


1119 


BL00826


MARCKS family proteins. 




1119 


BL00348 


p53 tumor antigen proteins. 


BL00348F 23.19 5.881e-10 93-135 
tit r\/Y*ilflT? O'* 1Q £ 01-133 


1119 


PD01457 


RIBOSOMAL PROTEIN 40S /£UNO- 

1? rvTr 1 T?T"> 1\ A C~P A \ 

FINGER Mfc 1 AL. 


prim 4*7 a ifi^l 8 ?1fip-0Q 73-1 17 


1119 


BL00752 


XPA protein.


BL00752B 19.17 7.866e-09 100-143 
BL00752B 19.17 8.979e-09 63-106 


1119 


DM01269 


303 kW ACI1VA11XNU KAJN 

GTPASE ISOZYME. 


r>M01?6QA ?3 35 9 446e-09 109-136 


1124 


DM01813 


EGG-LAY IN U raUKJYLUJNi:. 


DM01813A 15.31 5.215e-09 15-42


1127 


BL00452 


Guanylate cyclases proteins. 


BL00452A 17.52 1.170e-09 6-27 


1131 


BL00113 


Adenylate kinase proteins. 


BL00113B 20.49 9.897e-09 157-200 


1162 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


pnmw;fi i o A3 7 onn<»-35 94-62 


1163 


BL00407


Connexins proteins. 


m nnztn7P. id ?3 o 77^p-^0 21-51 
BL00407C 14.61 2.500e-24 52-79 


1163 


PR00206 


CONNEXIN SIGNATURE


PRnn?n/\P» 1^ 7S 1 9S7e-24 33-55 
PR00206A 11.35 6.559e-23 2-26 
PR00206C 15.16 7.469e-20 58-78


1171 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.


PD01066 19.43 8.500e-28 35-73 


1177 


DM01803 


1 HERPESVIRUS 
GLYCOPROTEIN H. 


DM01803C 7.00 7.240e-09 46-55 


1190 


PR00774 


GUANYLIN PRECURSOR 
SIGNATURE 


PPft0774A fs 4Q R S7Qp-10 69-81 


1195 


PD02059 


CORE POLYPROTEIN PROTEIN 
GAG CONTAINS: P. 


PD02059C 21.58 8.031e-09 100-140 


1197 


BL00472 


Small cytokines 
(intercrine/chemokine) C-C 
subfamily signatur. 


BL00472A 7.45 8.000e-14 1-12 


1213 


PR00437 


SMALL CXC CYTOKINE 


PR00437C 14.85 1.310e-16 33-51 
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FAMILY SIGNATURE 




1213 


BL00471 


Small cytokines 
(intercrine/chemokine) C-x-C
subfamily signat. 


TkT nn/ni m no n Ci£f\a 1ft £ 

BL0U4/1 15. y 2. /.youe-iU OOj 


1216 


PR00308 


TYPE I ANTIFREEZE PROTEIN 
SIGNATURE 


PR00308C 3.83 5.208e-09 183-192 


1222 


PF00852 


Fucosyl transferase. 


PF00852F 15.97 1.409e-15 195-231 


1224 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 6.301e-l 1 4/-y& 


1230 


PR00540 


MUSCARINIC M3 RECEPTOR 
SIGNATURE 


PR00540A 10.24 7.174e-09 134-153 


1240 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290A 20.89 7.480e-10 160-182 
BL00290B 13.17 2.875e-09 226-243 


1258 


PR00792 


PEPSIN (A1) ASPARTIC
PROTEASE FAMILY SIGNATURE 


PR00792A 11.54 5.500e-18 80-100 


1258 


BL00141 


Eukaryotic and viral aspartyl 
proteases proteins. 


BL00141A 12.10 4.789e-15 87-102 

T%T AA1 AITi 11 1/1 O OOOfl 1 A IOC lOQ 

BLOOM IB 12.14 z.yzye-iu zzo-z^y 


1300 


BL00616 


Histidine acid phosphatases 
phosphohistidine proteins. 


BL00616A 11.86 1.000e-09 136-143


1301 


DM01417 


6 kw INDUCING XPMC2 
MUSHROOM SPAC22G7.04. 


DM01417C 12.93 9.325e-12 361-372 
DM01417D 11.08 9.820e-12 400-415 


1302 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 6.067e-11 324-338


1311 


BL00926 


Lysyl oxidase copper-binding region 
proteins. 


BL00926B 13.84 7.453e-09 84-121 


1320 


PR00830 


ENDOPEPTIDASE LA (LON) 
SERINE PROTEASE (S16)
SIGNATURE 


PR00830A 8.41 3.712e-09 29-48 


1325 


BL00048 


Protamine P1 proteins.


BL00048 6.39 4.671e-10 58-84
BL00048 6.39 4.908e-10 60-86 
BL00048 6.39 2.913e-09 59-85 
BL00048 6.39 5.950e-09 57-83 


1345 


PF00424 


REV protein (anti-repression 
transactivator protein). 


PF00424A 14.34 2.436e-09 184-215 


1345 


BL00048 


Protamine P1 proteins.


BL00048 6.39 4.553e-10 178-204 
BL00048 6.39 6.513e-09 179-205 


1353 


DM01354 


kw TRANSCRIPTASE REVERSE II
ORF2. 


DM01354U 12.24 2.857e-15 82-101 


1363 


PF00850 


Histone deacetylase family. 


PF00850B 10.13 5.154e-14 95-109 
PF00850C 14.55 9.063e-11 132-148


1389 


PR00833 


POLLEN ALLERGEN POA PI 
SIGNATURE 


PR00833H 2.30 6.423e-09 50-64 


1389 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306B 5.57 /.UOUe-uy Dy-Ov 


1396 


BL00427 


Disintegrins proteins. 


BL00427 13.93 7.698e-17 260-314 


1396 


PR00289 


DISINTEGRIN SIGNATURE 


nnnnoon a n c*t c can™. \A 1*7/1 OQi 

PR00289A 13.62 5.6o7e-14 z/4-zyj 


1416 


BL00419 


Photosystem I psaA and psaB 
proteins. 


BL00419B 22.23 9.489e-09 18-51 


1434 


PF00075 


RNase H. 


PF00075I 16.21 7.375e-11 167-173


1440 


BL00598 


Chromo domain proteins. 


f->T A AC nO 1 A A C 1 CAA& If 1 1*) \'X'X 

BL00598 14.45 l._>UUe-l.> HZ-ioJ 


1440


PR00504




PR00504B 9.12 5.200e-13 106-120
PR00504C 11.19 6.510e-09 121-133 


1450 


PF00622 


Domain in SPla and the RYanodine 
Receptor. 


PF00622B 21.00 2.227e-09 93-114 


1451 


PD02935 


FATTY ACID 

OXIDOREDUCTASE BIOSYNT. 


PD02935C 16.62 4.375e-16 59-86 


1467 


BL00479 


Phorbol esters / diacylglycerol 


BL00479A 19.86 3.000e-11 130-152
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JvcSUllh 






binding domain proteins. 




1468 


PF00992 


Troponin. 


PF00992A 16.67 5.563e-10 139-173 


1468 


BL00795 


Involucrin proteins. 


r»i nmncr 1 1*7 f\/Z 1 (Trtrt- AO 1Q1 91*7 


1468 


PR00042 


FOS TRANSFORMING PROTEIN 
SIGNATURE 


PR00042D 8.97 7.554e-09 141-162 


1474 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 9.308e-12 62-92


1474 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


FKUOlUytJ IL.2.1 1 .DCoe-l/V 0Z-5U 


1474 


BL00239 


Receptor tyrosine kinase class II 
proteins. 


BLUOzjyc lo./j 4.zUje-uy 4y-/ 1 


1475 


BL00456 


Sodium: solute symporter family 
proteins. 


oL0U4 jOU 24.D3 4.oooe-Zo ID-Oy 


1480 


BL00983 


Ly-6 / u-PAR domain proteins. 


BL00983C 12.69 1.346e-09 36-51 


1482 


BL00979 


G-protein coupled receptors family 3 
proteins. 


BL00979A 19.66 9.633e-12 74-121 


1502 


PD02561 


DETHIOBIOTIN SYNTHETASE 
SYNTHASE. 


PD02561B 12.71 9.308e-09 176-182 


1506 


BL00297 


Heat shock hsp70 proteins family 
proteins. 


BL00297H 15.46 9.625e-23 302-355 
BL00297D 11.95 6.063e-21 166-205 
BL00297E 18.56 o.or/e-zi ZZO-Zoy 
BL00297C9.51 9.o67e-15 1U>1jO 


1506 


PR00301 


70 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00301I 12.76 3.208e-11 320-336


1513 


PR00130 


DNASE I SIGNATURE 


PR00130E 14.66 5.046e-09 237-zoo 


1515 


DM01242 


3 THREONINE--TRNA LIGASE.


DM01242A 20.32 5.286e-20 163-206 


1517 


BL00983 


Ly-6 / u-PAR domain proteins. 


BL00983B 8.19 5.935e-10 40-49


1520 


BL00415 


Synapsins proteins. 


BL00415P 2.37 3.914e-10 138-173 


1520 


PR00049 


WILMS TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 3.746e-09 124-138 
PR00049D 0.00 1.000e-08 123-137


1530 


PF00075 


RNase H. 


PF00075F 12.87 5.500e-10 127-137 


1537 


PR00463 


E-CLASS P450 GROUP I 
SIGNATURE 


PR00463F 17.63 5.219e-13 288-306 
PR00463A 11.40 8.714e-12 0Z-/1 
PR00463B 17.50 5.041e-10 76-97 


1537 


PR00385 


P450 SUPERFAMILY 
SIGNATURE 


PR00385C 16.94 o.318e-09 2o9-3UU 


1538 


PR00709 


AVIDIN SIGNATURE 


PR00709A 4.60 5.585e-09 19-37 


1553 


DM01354 


kw TRANSCRIPTASE REVERSE II 
ORF2. 


DM01354Y 10.69 6.423e-16 113-152 


1558 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.400e-25 70-108 


1564 


PF00589 


Phage integrase family. 


PF00589B 16.17 1.621e-11 158-171
PF00589C 14.62 9.609e-10 183-194


1566 


BL00908 


Mandelate racemase / muconate 
lactonizing enzyme family signa. 


BL00908B 37.71 6.455e-13 191-245


1567 


PR00702 


ACRIFLAVIN RESISTANCE 
PROTEIN FAMILY SIGNATURE 


PR00702A 14.92 2.421e-25 8-32 
PR00702B 12.77 9.690e-18 36-54 


1570 


BL01047 


Heavy-metal-associated domain 
proteins. 


BL01047A 13.50 5.125e-17 75-97 


1575 


DM01354


kw TRANSCRIPTASE REVERSE II
ORF2.




1606 


PF00642 


Zinc finger C-x8-C-x5-C-x3-H type 
(and similar). 


PF00642 11.59 2.575e-11 197-207


1610 


DM01354 


kw TRANSCRIPTASE REVERSE II
ORF2. 


DM01354I 15.55 7.702e-34 348-388
DM01354G 11.57 3.625e-32 277-307
DM01354H 18.00 2.528e-23 308-347 






WO 03/080795 PCT/US02/25485 




Table 3 



SEQ ID
NO:


Database 
entry ID 


Description 


♦Results 








TYfcAAl KAT? 1 A W 4 HfiSa 1 1 0/1 1 07£ 

1/M.U1J54.T 14.50 4.Uooe-l 1 Z41-Z/0 


lolo 




PRECURSOR I. 


"DTlAOGOOA 00. 97 0 161* 0^ ^0 

Jt\Ul/ZyzyA Zo.Z/ Z.zOjC-Zj jZ-o5 


1 /CO*7 

loz7 


PR00121


SODIUM/POTASSIUM-
TRANSPORTING ATPASE
SIGNATURE


T»T> AA1 0 1 A /C T1 1 AAAa AQ 1 5. 00 

JrxCUUlzlA o./i i.uuue-uo 15-zy 


loJO 


PR00824


HEPATIC LIPASE SIGNATURE


PKU0o24A /.ol /.zl4e-zZ o-Z4 


1640 


BL00359 


Ribosomal protein L11 proteins.


BL00359C 22.18 1.155e-11 93-126


1641 


PR00080 


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE 


PR00080A 9.32 8.839e-10 134-145 


1641 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY
SIGNATURE


PR00081A 10.53 2.000e-12 45-62 
PR00081E 17.54 1.783e-10 238-255 

TVD ArtAO ID 1 A lO T TTT«. AO 1 1 A "1 A Z 

PKUUUolr> 1U.38 Z.2Z/e-09 134-145 


1641 


BL00061 


Short-chain 

dehydrogenases/reductases family 
proteins. 


BL00061A 9.41 9.053e-10 134-144 

T>T AAA^ID OC TO H 0<A„ Art 1 AT T*2y1 

uluuuoId Z5./y o.&oue-uy iy/-zj4 


1000 


r>T ai tct 


Ribosomal protein L10e proteins.


T>T A10<,OT\ 1 Q QA 0 A"71*» 1 ^ C Q OQ 

dIJjiZj lu lo.oU z.y/oe-15 5y-y<s 


1667 


BL01241


Link domain proteins. 


nr AITyll If Q1 O C7fl« n 1 OA OT) 

r>LUlz41 .35. ol o.5/9e-3/ IoV-ZJ2 
BL01241 35.81 7.835e-14 289-341 


1667 


BL00086 


Cytochrome P450 cysteine heme- 
iron ligand proteins. 


BL00086 20.87 3.377e-09 283-314 


1668 


PR00671 


INHIBIN BETA B CHAIN
SIGNATURE 


PR00671A 8.36 8.088e-09 4-22 


1672 


BL00674


AAA-protein family proteins. 


BL00674E 15.24 5.680e-15 31-50 


1682 


PF00075 


RNase H.


PF00075A 14.44 4.400e-13 73-89 

r»T7AAATC/^ 1 1 CO O A Z^vri 1 1 <T*> 

PF00075C 11.58 8.442e-09 152-163 


1689 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.471e-27 268-306 


1689 


PR00788 


NITROPHORIN SIGNATURE 


PR00788A 9.79 6.108e-09 3-15 


1692 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 4.759e-10 32-83 


1697 


PR00423 


CELL DIVISION PROTEIN FTSZ 
SIGNATURE 


PR00423E 7.36 4.038e-09 20-41 


1706 


BL00795 


Involucrin proteins. 


BL00795C 17.06 5.395e-10 185-229 


1709 


BL00514 


Fibrinogen beta and gamma chains 
C-terminal domain proteins. 


BL00514C 17.41 3.618e-25 68-104 
BL00514H 14.95 6.745e-16 230-254
BL00514G 15.98 6.566e-14 198-227 
BL00514E 14.28 8.286e-14 128-144 
BL00514D 15.35 2.915e-12 109-121 


1714 


PF00878 


Cation-independent mannose-6- 
phosphate receptor repeat proteins. 


PF00878T 17.51 3.818e-09 41-67


1715 


PF01140


Matrix protein (MA), p15.


PF01140D 15.54 4.872e-09 123-157 


1715


PF00992


Troponin. 


TJT7A AA AT A 1 C C~l C A C 1 « 1A1 A A 1/11 

PF00992A 16.67 6.451e-10 109-143
PF00992A 16.67 3.724e-09 98-132 
±*ruwyyzA lo.o/ o.oc>4e-uy yo-i^u 


1718 


PD02474 


SYNTHASE SMALL SUBUNIT 

a /""Tyrwr a fT 
ACblULAUI. 


PD02474B 21.08 7.940e-10 92-130 


1725 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412B 10.60 1.000e-10 46-82 


1725


PR00215


NEUROMODULIN SIGNATURE


TJT> r»A"> 1 CO 1 *5 AO £11 C~ 1 A C A *7A 

PR00215C 13.9S o.lloe-10 54-74 


1725 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688G 16.45 3.160e-09 119-150 


1725 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 8.564e-09 303-335 


1727 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 7.750e-21 185-215 


1727 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 7.176e-12 185-203 






Table 3 



SEQ ID
NO: 
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entry ID 


Description 


♦Results 


1727 


BL00239 


Receptor tyrosine kinase class II
proteins. 


tjt AAOOQP K |<4 lR7ft-ftO 1 1 0-1 ftfi 


1728 


BL00415 


Synapsins proteins. 


t>t AAA1 1 C 1 1 ^a_AQ ^9 87 
dLUI/40v£ z.Z3 o.llje-uy jZ-6/ 


1734 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


DnAI 07AU OO 1 R < ^£7*» 1 fi 7*\ 111 
rUUlz/UU Zz.lo D.DO/e-lo /D-llL 

PD01270C 19.54 1.167e-17 118-146 
Dnnio7nA 17 99 /i Ofiflp_id 91.60, 

r\J\JiLI\jJ\ 1 I .LI. H-.youc-ln Zl-Uv 
pnni ?7nn 94 4 9R4e-flQ 1 87 


1736 


PD02346


PHOTOSYSTEM II PROTEIN
PRECURSOR PHOTOSYNTHESIS.


PTifi9XifiA 0 94 8 8S1e-09 6-17 


1741 


BL00415 


Synapsins proteins. 


■rt AA4 1 SO 9 9 1 (\ 777p_0Q ^ 1 7-^ S? 


1744 


BL00479 


Phorbol esters / diacylglycerol 
binding domain proteins. 


dt A/vi7Qia 1 9 ^7 1 fiflft p-OR ^3-48 


1750 


PR00763 


COAGULIN SIGNATURE 


PR00763B 8.39 6.457e-09 41-60 


1754 


PR00276 


INSULIN A CHAIN SIGNATURE 


TVDAAOOfCA 1 1 QA 7 QAClt* AO 21/^-5^ 
rKUUz/oA 11. 84 /.o4Ue-uy 400.> 


1755 


PR00042


FOS TRANSFORMING PROTEIN
SIGNATURE 


rR0UU4zL> o.y/ Z.jtwe-uy 104-ioj 


1755 


PF00922 


Vesiculovirus phosphoprotein. 


typaaaooa io n ^ 7cn Q rvn QQ I'lo 

rrUU92zA iy.i / 3. oye-uy yy-ijz 


1778 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 9.836e-14 59-80 

nnnm a c/-> t Ovl 1 c^Aa 1 "5 OIO 0 **0 

PK0024jL- /.o4 l.j4Ue-lJ ZJ/-Z3Z 
PR00245B 10.38 2.125e-13 176-190 


1778 


BL00237 


G-protein coupled receptors proteins. 


BLO0237A 27.6o 1.474e-lZ yu-lzy 


1778 


PR00534 


MELANOCORTIN RECEPTOR 
FAMILY SIGNATURE 


PR00534A 11.49 4.729e-09 51-63


1778 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 3.613e-09 26-50
PR00237C 15.69 7.525e-09 104-120


1787 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 5.114e-15 146-165 
PR00007A 19.33 7.052e-10 119-145 


1787 


PR00524 


CHOLECYSTOKININ TYPE A 
RECEPTOR SIGNATURE 


PR00524F 5.36 4.351e-09 70-83 


1787 


DM00250 


kw ANNEXIN ANTIGEN 
PROLINE TUMOR. 


DM00250B 13.84 6.595e-09 82-105 


1787 


BL00415 


Synapsins proteins. 


BL00415N 4.29 7.372e-09 62-105 


1787 


BL01113


C1q domain proteins.


BL01113B 18.26 3.786e-23 125-160
BL01113A 17.99 7.968e-15 73-99
BL01113A 17.99 5.091e-14 70-96
BL01113A 17.99 5.295e-11 64-90
BL01113A 17.99 8.568e-11 79-105
BL01113A 17.99 8.977e-11 67-93
BL01113A 17.99 4.635e-09 82-108
BL01113A 17.99 6.192e-09 76-102
BL01113A 17.99 7.750e-09 61-87


1787 


BL00420 


Speract receptor repeat proteins 
domain proteins. 


BL00420A 20.42 8.691e-11 73-101
BL00420A 20.42 y.o/Je-1 1 /U-yo 

r>T Art/I OA A OA /IO O 1 BAa 1 A RO. 
DlA)U4ZUA ZU.4Z Z.Iove-lU jj-Oj 

BL00420A 20.42 8.062e-09 52-80 


1789 


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W. 


rv\ jTai nine i c yl i O QAA** il /I? CO 

UMOlviUb ij.4i z.yo4e-jj 4D-oy 


1795 


DM01688 


2 POLY-IG RECEPTOR. 


r»n/f Ai ££QT 1 /t 07 7 ^IBAa 1 A 1 A7 1 *?4 

DM01688J 14.69 4.455e-09 60-96 




PF00075




PF00075J 15.78 4.115e-13 115-132 


1802 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 4.130e-11 86-98


1802 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 1.600e-10 110-126
BL00028 16.07 6.100e-10 70-86 


1802 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 9.438e-10 83-92
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'Results 


1812 


PD00078 


REPEAT PROTEIN ANK
NUCLEAR ANKYR.


PD00078B 13.14 4.130e-09 157-169 


1824 


PF00628 


PHD-finger. 


PF00628 15.84 5.500e-13 78-92 


1833 


PF00075 


RNase H. 




1833 


PR00939 


C2HC-TYPE ZINC-FINGER 
SIGNATURE 


PR00939A 8.95 3.045e-09 137-146 


1842 


PR00833 


POLLEN ALLERGEN POA PI 
SIGNATURE 


PR00833H 2.30 3.192e-09 244-258 


1844 


BL00972 


Ubiquitin carboxyl-terminal
hydrolases family 2 proteins. 


t>t nno'TOT^ 99 i i4ft#o i 16R.1Q9 


1857 


PF00424


REV protein (anti-repression 
transactivator protein). 


PF00494A 14 14 ORSe-09 71-102 


1860 


PR00221 


CAULIMOVIRUS COAT PROTEIN
SIGNATURE


PPO0991H 19 8? 9 410e-09 184-197 



1864


BL01282


BIR repeat proteins. 


BL01282B 10.49 1.116e-10 214-252


1866 


BL00155


Cutinase, serine proteins. 


BL00155D 26.87 5.337e-09 19-67 


1895 


PF00075 


RNase H. 




1911 


BL00983


Ly-6 / u-PAR domain proteins. 


BL00983C 12.69 6.365e-09 101-116 


1911 


BL00272


Snake toxins proteins. 


BLUUz/zC o.z/ l.UUue-Uo 1U3-110 


1925 


PR00308 


TYPE I ANTIFREEZE PROTEIN 
SIGNATURE 


PR00308A 5.90 6.795e-11 64-78
PR0U3UoC i.oi Z.3o!)e-lU 0/-/0 


1925 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 9.438e-10 :> /-/l 


1925 


PR00833 


POLLEN ALLERGEN POA PI 
SIGNATURE 


PR00833H 2.30 6.654e-09 59-73 


1930 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DMOul/9 13.9/ !>.zo3e-IU lu/-lio 


1935 


PF00075 


RNase H. 


PF00075J 15.78 2.309e-12 81-98 


1940 


PF00075 


RNase H. 


PF0Q075r 12. 0/ i.o04e-Uy /4-o4 


1952 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 3.250e-10 184-197 

TiTi nnmn a i i i n c acsia. AO i Q"7 onft 

PKOUOiyA 1 1.1 y 3.00/ e-uy io/-zuu 


1954 


BL00546 


Matrixins cysteine switch. 


BL00546A 19.62 8.105e-30 77-106 


1954 


BL00023 


Type II fibronectin collagen-binding 
domain proteins. 


BL00023 24.31 4.682e-35 340-376 
BL00023 24.31 2.969e-28 282-318
BL00023 24.31 9.526e-24 224-260 


1954 


PR00138 


MATRIXIN SIGNATURE


PR00138B 15.82 5.500e-18 144-159 
PR00138A 15.14 8.773e-16 97-110 


1954 


BL00024 


Hemopexin domain proteins. 


BL00024B 21.53 9.591e-33 118-151 
BL00024A 11.49 2.800e-13 97-107
BL00024C 22.98 7.796e-11 164-212


1954 


PR00013 


FIBRONECTIN TYPE II REPEAT
SIGNATURE 



PR00013C 12.29 1.000e-20 372-387 
PR00013C 12.29 3.571e-15 314-329 
PR00013C 12.29 7.800e-14 256-271 
PR00013A 12.26 5.500e-13 344-353 
PR00013B 14.75 1.237e-11 355-367
PR00013B 14.75 4.000e-09 297-309
PR00013A 12.26 5.333e-09 286-295 

1iT> AAA 1 1 A 19 9£ *7 Ollo OQ 99C 917 

PKUUU13A LZ.ZO / .oiie-uy ZZo-Z J / 


1957 


BL01182 


Glycosyl hydrolases family 35 


BL01182A 21.39 3.357e-34 77-119 


1957 


PR00742 


GLYCOSYL HYDROLASE 
FAMILY 35 SIGNATURE 


PR00742B 15.52 2.653e-14 78-96 
PR00742A 13.75 6.914e-10 57-74 


1958 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 8.200e-15 214-235 


1964 


PR00727 


BACTERIAL LEADER 
PEPTIDASE 1 (S26) FAMILY 


PR00727A 12.93 7.000e-09 9-25 
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^jKesuiis 






SIGNATURE




1965 


PF00075 


RNase H. 


PF00075D 10.71 7.188e-09 71-81 


1966 


PF00075 


RNase H.


PF00075C 11.58 9.786e-11 110-121
rrOUU/D-b IZ.jO l.o/oe-lU /0-00 


1968 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 4.082e-11 314-347


1970 


PF00075 


RNase H. 


PF00075J 15.78 8.571e-10 335-352 


1973 


PF00589 


Phage integrase family. 


PF00589B 16.17 1.450e-14 101-114 


1974 


BL00675 


Sigma-54 interaction domain 
proteins ATP-binding region A 
proteins. 


BL00675B 24.07 1.000e-24 118-172 
BL00675C 13.51 6.400e-24 183-210 
BL00675D 12.03 1.750e-09 245-254


1987 


PR00153


CYCLOPHILIN PEPTIDYL-
PROLYL CIS-TRANS
ISOMERASE SIGNATURE


PR00153B 11.57 1.500e-17 52-64 

t\t* r\e\ i co A to AO A ICC™. 1 A 11 ~) O 

PR00153A 12.98 4.255e-10 23-38 


1987 


BL00170 


Cyclophilin-type peptidyl-prolyl cis- 
trans isomerase signatur. 


BLOul /UB 2U.y/ o.^jUeoJ 4/-00 
BL00170A 17.08 2.309e-09 17-43 


1998 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PDQIOoo 19.43 7.7M)e-3/ Z/-CO | 
PD01066 19.43 8.863e-11 68-106


1999 


PF00992 


Troponin. 


TVT7AAAAO A 1 C £H 1 A 0*7 « fifl 1 AO "\ A1 

PF00992A 16.0 / J.45/e-U9 1U5-14Z 


1999 


BL00224 


Clathrin light chain proteins. 


BL00224B 16.94 7.055e-09 96-148 


1999 


BL00422 


Granins proteins. 


BL00422C 16.18 8.059e-09 117-144 


2001 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019B 13.34 7.158e-14 261-283 


2001 


DM01354 


kw TRANSCRIPTASE REVERSE II
ORF2. 


DM01354U 12.24 3.500e-13 345-364 


2008 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 3.483e-16 63-90 


2011 


BL00282 


Kazal serine protease inhibitors 
family proteins. 


BL00282 16.88 6.577e-10 127-149 


2011 


BL00222 


Insulin-like growth factor binding 
proteins. 


BL00222B 11.09 6.940e-10 74-89 


2011 


BL00621 


Tissue factor proteins. 


BL00621A 8.69 6.473e-09 5-22 


2012 


PD02563 


PROTEIN NONSTRUCTURAL C 
VP18. 


PD02563C 13.51 9.634e-10 74-128 


2013 


PR00124 


ATP SYNTHASE C SUBUNIT 
SIGNATURE 


PR00124A 8.81 5.655e-09 58-77 


2013 


PR00783 


MAJOR INTRINSIC PROTEIN 
FAMILY SIGNATURE 


PR00783C 13.54 8.981e-09 48-67 


2034 


PF00075 


RNase H. 


PF00075F 12.87 6.523e-09 183-193 


2037 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 9.327e-09 115-155 


2048 


PR00671 


INHIBIN BETA B CHAIN 
SIGNATURE 


PR00671B 4.29 8.767e-10 138-157


2052 


PD02455 


ELEMENT TRANSPOSABLE 
INSERTION PROTEIN 
TRANSPOSITION DNA. 


PD02455C 29.23 5.230e-09 225-276 


2058 


PF00075 


RNase H.


PF00075J 15.78 9.000e-10 81-98 


2074 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 4.000e-13 62-74 


2074 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


r KAJUvJHOD 0.1/ Z *t.*fO<&C-l 1 jy-oo 

PR00048B 6.02 1.000e-10 89-98 
PR00048A 10.52 9.609e-10 101-114


2074 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 9.100e-13 104-120 
BL00028 16.07 1.000e-08 46-62 


2076 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 1.900e-11 106-119
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SEQ 
ID 

NO: 


Pfam Model


Description 


E-value 


Score 


No. of

Pfam 

Domains 


Position 
of the 
Domain 


1050 


FAA_hydrolase 


Fumarylacetoacetate (FAA) hydrolase 
fam 


0.64 


-89.1 


1 

— 


ZZ-14J 


1066 


rubredoxin 


Rubredoxin 


7.2 


iii 
-li.i 






1076 


ank 


Ankyrin repeat 


0.01 


22.5 


i 


25-57 


1076 


sodfe_C 


Iron/manganese superoxide dismutases, 
C-term 


3.9 


-67.9 





Jo-lz4 


1076 


DUF232 


Putative transcriptional regulator 


8.1 


-29.1 


J 


1 1A O^/t 


1099 


HMG_box


HMG (high mobility group) box 


8 


-22.4 


J 


1*7 At 


1109 


UPAR_LY6


u-PAR/Ly-6 domain 


0.21 


-6.2 


J 


1A IT) 

34-1 Iz 


1110 


ldl_recept_a


Low-density lipoprotein receptor 
domain 


8.8e-07 


36.0 




iyo-z4u 


1110 


CUB 


CUB domain 


0.38 


-27.8 


i 


52-161 


1118 


rvt 


Reverse transcriptase 


0.95 


-40.1


~ 


35-ZU f 


1125 


adenylatekinase 


Adenylate kinase 


0.00037 


-77.6 




111 (YX 
X 3-lU.s 


1162 


KRAB 


KRAB box 


1.1e-23


92.1 


i 


22-62 


1163 


connexin 


Connexin 


3.1e-23 


90.6 


— 


1 1 1 A 

1-130 


1171 


KRAB 


KRAB box 


6.6e-22 


86.2 




33-73 


1193 


MHC_I 


Class I Histocompatibility antigen, 
domains 


2e-06 


1.1 




29-205 


1209 


DOMON 


DOMON domain 


1.9e-12 


54.8 


i 


102-215 


1213 


IL8 


Small cytokines (intecrine/chemokine),
inter 


0.59 


-7.8 


1 


18-55 


1218 


cys_rich_FGFR


Cysteine rich repeat 


4.4 


-11.0 


! 


28-76 


1222 


Glyco_transf_10


Glycosyltransferase family 10 


6.6e-06 


-54.1 




1-322 


1240 


ig 


Immunoglobulin domain 


1.6e-06 


35.1 


2 


41- 

124:156- 
230 


1258 


asp 


Eukaryotic aspartyl protease 


8e-06 


-110.8 


1 


19-241 


1280 


DOMON 


DOMON domain 


8.9 


-16.6 


1 


35-117 


1288 


PDZ 


PDZ domain (Also known as DHR or 
GLGF) 


1.1 


0.4 


1 


7-73 


1301 


Exonuclease 


Exonuclease 


3.4e-33 


123.7 


1 


322-479 


1311 


Gemini_mov 


Geminivirus putative movement 
protein 


5.7 


^0.5 


1 


15-79 


1341 


fn3


Fibronectin type III domain


6.6e-36 


132.7 


2 


109- 

200:212- 
301 


1345 


Collagen 


Collagen triple helix repeat (20 copies) 


7.3 


-65.8 


1 


IOC 1 A 1 
153-Z43 


1365 


Amidase 


Amidase 


0.017 


-178.9 




oo-Z/o 


1375 


Galactosyl_T


Galactosyltransferase 


7.1e-44 


159.2 


i 


113-309 


1375 


Glyco_transf_25


Glycosyltransferase family 25 


3 


-77.1 


1 


146-293 


1381 


GRAM 


GRAM domain 


6.6e-14 


59.6 


1 


65-116 


1396 


Pep_M12B_prop 
ep 


Reprolysin family propeptide 


1.4e-27 


105.1 


1 


75-191 


1396 


disintegrin 


Disintegrin 


2.6e-10 


47.7 




Z43-318 


1398 


SKchannel 


Calcium-activated SK potassium 
channel 


1.8e-06 


34.9 


i 


1-57 


1413 




Immunoglobulin domain 


5.4 


9.1 


1 


on off 


1416 


dUTPase 


dUTPase 


0.00044 


9.6 




Hl-237 


1420 


Folate_rec


Folate receptor family 


1.7 


-111.2 




14-175 


1434 


lectin_c


Lectin C-type domain 


1.5e-05 


28.0 




233-319 


1440 


chromo 


'chromo' (CHRromatin Organization 
MOdifier) 


4.6e-11


50.2 




92-133 


1449 


PMSR 


Peptide methionine sulfoxide reductase 


0.0089 


-65.8 




4-79 


1450 


SPRY 


SPRY domain 


9e-26 


99.0 




109-240 
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SEQ 
ID 

NO: 


Pfam Model


Description 


E-value 


Score 


No. of

Pfam 

Domains 


Position 
of the 
Domain 


1451 


MaoC_dehydrata
s 



MaoC like domain 


2.1e-15 


64.6 


1 


31-152 


1463 


Nlr transi I 




2.6e-12 


54.3 


j 


121-234 


1467 


DAG_PE-bind


Phorbol esters/diacylglycerol binding
dom


8.7e-05 


27.4 


j 


130-180 


1467 




l/\_.i oomain 


0.66 


11.2 




141-172 


1470 




jmjC domain 


0.46 


-18.2 


j 


166-262 


1474 


pkinase


Protein kinase domain 


0.0019 


-85.7 


j 


2-187 


1475 


SSF 


Sodium: solute symporter family 




-177.1 




1-311 


1478 


dUTPase 


dUTPase


7.6 


-37.5 




2-98 


1479 


nU 




1.1e-19


78.9 


j 


14-100 


1485 


rnaseH 


RNaseH 


0.36 


-28.0 


1 


59-175 


1488 


NTR 


NTR/C345C module


0.044


-6.1 




293-398 


1506 


HSP70 


Hsp70 protein


1 fip-17 


38.3 




61-424 


1517 


UPAR_LY6


u-PAR/Ly-6 domain 


0.77


-8.2 




44-106 


1530 


rnaseH 


RNaseH 


0.011 


-11.7 




64-155 


1537 


p450 


Cytochrome P450 


O 1 

Z, 1 


-I / o.u 


— 


31-316 


1537 


DNA_ligase_OB


NAD-dependent DNA ligase OB-fold
domain 


y.z 


-49.0




200-256 


1558 


KRAB 


KRAB box 


1.8e-18 


74.8 




68-108 


1564 


Phage_integrase


Phage integrase family 


1.2e-09 


45.5 


: 


39-204 


1566 


MRMLE 


Mandelate racemase / muconate 
lactonizing en


0.00079 


-24.5 




153-352 


1570 


HMA 


Heavy-metal-associated domain 


6.6e-13 


56.3 




71-131 


1580 




Immunoglobulin domain


0.99 


15.2 




23-131


1601 


WD40 


WD domain, G-beta repeat 


2e-08 


41.5 


3 


39- 

7j:o3- 

118:126- 

162 


1606 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H type


0.094 


19.3 


3 


1 AC 

19G«1 AI 
IZir. 141- 

177-1 87- 

209 


1612 


zf-CCHC 


Zinc knuckle 


2.1e-05


31.4


o 
z 


184:202- 
219 


1618 


rnaseH 


RNaseH 


6.3e-14 


59.7 


1 


24-144 


1618 


Integrase_Zn 


Integrase Zinc binding domain 


i.oe-u/ 


11 o 
3 /.Z 


1 
1 


146.1 RS 

l 4 ! O-l Q J 


1618 


DUF224 


Domain of unknown function
(DUF224)


9.3 




1 

1 


104-186


1641 


adh_short


short chain dehydrogenase 


/l 10 




1 


42-309 


1667 


Xlink


Extracellular link domain 


9 Qp 87 


90ft fi 


2 


162- 

267:273- 
364 


1667 


ig 


Immunoglobulin domain 




25.2 




61-145 


1682 


rvt 


Reverse transcriptase


3.1e-31 


117.2 




56-238 


1683 


Gag_p30


Gag P30 core shell protein


9 Op 77 

Z.7C-JJ 


194.0




8-197 


1689 


KRAB 


KRAB box


4 Op 99 


ou.u 




266-306 


1692 


ubiquitin 


Ubiquitin family 


0.00061 


26.5 




17-91 


1709 


fibrinogen_C 


Fibrinogen beta and gamma chains, C- 
term 


7.9e-85 


295.2 




37-255 


1713 


HOK_GEF


Hok/gef family 


2.4 


-7.8 




7-54 


1716 


Gag_p30 


Gag P30 core shell protein 


0.0036 


-49.7 




64-229 


1721 


rnaseH 


RNaseH 


0.011 


-11.7 




207-350 


1722 


dUTPase 


dUTPase 


0.37 


-22.9 




93-217 
Table 4


1725 


ig 


Immunoglobulin domain 




57 ft 




80- 

141:259- 
320 


1726




IQ calmodulin-binding motif


4.3e-05 


30.4 




49-69 


1727


pkinase 


Protein kinase domain 


3e-21 


84.0 




71-267 


1728 


Fringe 


Fringe-like 


S 0 
j.y 


-112.6 




165-370 


1734 




Immunoglobulin domain


0.014


22.0 


1 


117-170 


1737 


PP2C 


Protein phosphatase 2C 


0.0067 


-50.5 




37-273 


1738 


SH3 


SH3 domain 


i . / e-u j 


71 7 




102-159 


1740 


rnaseH 


RNaseH 


0.0042 


-7.3 


; 


126-270 


1744 


DAG_PE-bind


Phorbol esters/diacylglycerol binding 
dom 


2.9 


-11.1 




26-55 


1744 


PHD 


PHD-finger 


3.3 


-14.7 




9-61 


1760 


GARS_N 


Phosphoribosylglycinamide synthetase, 
N 


o.z 


-oz.u 






1760 


Armadillo_seg


Armadillo/beta-catenin-like repeat 


9.1 


0./ 


z 


AA 

0*f .iJl 

171 

i r i 


1778 


7tm_1


7 transmembrane receptor (rhodopsin
family)


1 e> 1 7 


55 7 
JJ. / 


1 

1 


41-276 


1778 


YCF9 


YCF9


3.1 


-18.5 




203-258 


1787 


C1q


C1q domain


1e-05


13.2 


1 


111-230 


1787 


Collagen 


Collagen triple helix repeat (20 copies) 


0.0043 


-3.0 


J 


DU-1U/ 


1789 


jmjC 


jmjC domain 


0.00078 


lz.0 




DZ-Z41 


1795 




Immunoglobulin domain 


0.0037 


23.9 




64-141 


1796 


rve 


Integrase core domain 


2.6e-28 


107.5 




20-174 


1802 


zf-C2H2 


Zinc finger, C2H2 type 


6e-15 


63.1 


2 


68- 

AO 

yu:iUo- 
i 7n 


1806 


Filamin 


Filamin/ABP280 repeat




1 Q < 


i 


OA 1 71 


1812 




Ankyrin repeat 


3.oe-Zi 


Gf\ A 


i 
j 


1 50 

iQi .ons. 
237:244- 
276 


1824 


PHD 


PHD-finger






1


62-110 


1826 


PAP_assoc


PAP/25A associated domain 


1.5e-06 


35.2 


1


101-155 


1827 




Immunoglobulin domain 


1.0


17.4




29-102 


1830 


RhoGEF 


RhoGEF domain 


3.3e-06 


24.0 


1


110-280 


1830 


PH 


PH domain 


2.8 


6.7 




356-451 


1833 


zf-CCHC 


Zinc knuckle 


z.le-Uo 




J 


177 1 54 


1833 


rvt 


Reverse transcriptase 


*"J T,. f\C. 

7.7e-0o 


zyy 




84 077 
54-Z / / 


1844 


UCH-2 


Ubiquitin carboxyl-terminal hydrolase 
family 


0.15 


-8.5 


1


165-238 


1846 


Armadillo_seg 


Armadillo/beta-catenin-like repeat


0.28 


ITT 
it. 1 


Z 


01 *Q7- 


1860 


zf-CCHC 


Zinc knuckle 


3.2e-05 


30.8 


1 


179-196 


1864 


zf-C3HC4 


Zinc finger, C3HC4 type (RING


fl OA77 
U.wZZ 


77 1 


1 


218-256 


1887 


ig 


Immunoglobulin domain 


4e-08 


40.4 


1 


35-112 


1889 


LRR 


Leucine Rich Repeat 


0.051 


20.1 


1 


62-85 


1895 


rnaseH 


RNaseH 


3.4e-06 


25.8 


1 


47-177 


1899 


Brevenin 


Brevenin/esculentin/gaegurin/rugosin 
family 


7.5 


-2.9 


1 


1-51 


1911 


UPAR_LY6


u-PAR/Ly-6 domain 


1.3e-06 


35.4 


1 


44-117 


1911 


toxin 


Snake toxin 


3 


-19.5 


1 


66-117 


1911 


Activin_recp


Activin types I and II receptor domain 


9.5 


-14.0 


1 


30-118 


1912 


rvp 


Retroviral aspartyl protease 


7 


-26.3 


1 


42-142 


1913 


SAM 


SAM domain (Sterile alpha motif) 


3.9e-13 


57.1 


2 


105- 

170:183- 
247 


1916 


Sema 


Sema domain 


1.4e-14 


54.6 


1 


51-434 


1926 


PAP2 


PAP2 superfamily 


2.9e-07 


37.6 


1 


48-142 


1930 


ig 


Immunoglobulin domain 


2.7e-07 


37.6 




41-116


1935 


rve 


Integrase core domain 


2.5e-13 


57.7 


1 


1-138 


1940 


rnaseH


RNaseH


1.1e-26


102.0 




24-153 


1940 


Integrase_Zn


Integrase Zinc binding domain 


4.7e-12 


53.5 




155-194 


1952 


LRRNT 


Leucine rich repeat N-terminal domain 


0.0027 


24.4 


1 


67-95 


1953 


UQ_con


Ubiquitin-conjugating enzyme


2.8e-08 


40.9 


1 


78-219 


1954 


Peptidase_M10


Matrixin


6.7e-86 


298.8 


1 


53-212 


1954 


fn2


Fibronectin type II domain 


1e-79


278.2 


3 


231- 
272:289- 
330:347- 
388 


1958 


ras 


Ras family 


1.9 


-132.0 




215-284 


1963 


tsp_1


Thrombospondin type 1 domain 


0.083 


8.0 


1 


20-63 


1966 


rvt 


Reverse transcriptase 


1.5e-05 


21.9 


1 


2-196 


1968 


G-patch 


G-patch domain 


0.3 


6.0 




307-352 


1968 


rvp 


Retroviral aspartyl protease 


1.4 


-19.9 


1 


274-385 


1970 


rve 


Integrase core domain 


0.78 


-16.8 




265-395 


1973 


Phage_integrase


Phage integrase family 


5.7e-08 


39.9 


1


1-153 


1974 


Sigma54_activat


Sigma-54 interaction domain 


3.1e-37 


137.2 


1 


63-253 


1975 


Na_Pi_cotrans


Na+/Pi-cotransporter


0.0085


-99.2 




1-146 


1975 


signal 


His Kinase A (phosphoacceptor) 
domain 


7 


-7.7 




85-147 


1978 


UPAR_LY6


u-PAR/Ly-6 domain 


1.8 


-16.0 


4 


21-96 


1978 


Zn_clus 


Fungal Zn(2)-Cys(6) binuclear cluster 
domain 


5.1 


-5.7 




21-60 


1987 


pro_isomerase


Cyclophilin type peptidyl-prolyl cis-tr 


1.2e-18 


75.4 


1


4-171 


1997 


zf-CCHC 


Zinc knuckle 


1.9e-05 


31.5 


2 


181- 

198:204- 
220 


1997 


TFIID-31


Transcription initiation factor IID,
31kDsn 


7.9 


-63.3 


1 


75-187 


1997 


Gag_p12


Gag polyprotein, inner coat protein p12


8.9 


-9.5 


1 


155-229 


1998 


KRAB 


KRAB box 


2e-23 


91.2 


1 


27-65


2001 


CH 


Calponin homology (CH) domain 


0.019 


10.8 




230-330 


2001 


SAM 


SAM domain (Sterile alpha motif) 


0.9 


6.5 


1 


248-311 


2008 


tsp_1


Thrombospondin type 1 domain 


0.013 


15.1 


1 


64-98 


2011 


ig 


Immunoglobulin domain


1.7e-05 


31.7 


1 


186-255 


2011 


kazal 


Kazal-type serine protease inhibitor 
domain 


0.00028 


27.6 


1 


121-168 


2011 


IGFBP 


Insulin-like growth factor binding 
protein 


0.17 


2.5 




53-113 


2011 


zf-UBR1


Putative zinc finger in N-recognin 


8.3 


-24.0 




54-112 


2015 


PH 


PH domain 


0.0002 


28.1 




174-281 


2015 


efhand 


EF hand 


0.00031 


27.5 




339-367 


2018 


RPEL 


RPEL repeat 


1.3 


11.8 




25-50 


2034 


rnaseH


RNaseH 


4e-27 


10^.6 




122-267 


2038


granulin


Granulin


7.7


-17.8


1


62-91


2052 
2057 


rve 

Pep_M12B_propep


Integrase core domain

Reprolysin family propeptide 


0.44 
8.7e-14 


-29.3 
59.2 


1 
1 

1 


160-314 
179-263 

1-140 


2058 
2074 


rve 

zf-C2H2 


Integrase core domain 

Zinc finger, C2H2 type


5.5e-22 


86.5 


3 


42- 

66:72- 

96:102- 


2074 


zf-BED 


BED zinc finger


0.94 


1 c 
1.0 


i 
i 


124 

91-129 


2074 
2076 


TP1 
LRR 


Nuclear transition protein 1 
Leucine Rich Repeat 


7.5 

3.2e-20 


2.2 
80.6 


1 
5 


1 21-76 
57- 

80:81- 
104:105- 
128:129- 
152:153- 


2076 
2076 


LRRNT 
LRRCT 


Leucine rich repeat N-terminal domain
Leucine rich repeat C-terminal domain


0.00013 
0.047 


28.8 
18.0 


1 


176 

27-55 

186-234 
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1076 




1061 


1061 


1061 


1061 




1050 


1050 




1av1




2tbv 




1cwv


1ciu




JO 






> 




O 


> 


> 






> 


> 


CHAIN 
ID 


§ 




u> 
vo 


u> 

ON 


L/i 

o 


u» 

4^ 




* 

On 


>— » 

ON 


START 
AA 


to 
VI 




-J 
o 


IO 

to 
v> 


to 
to 

ON 


t— ' 
^4 

to 




4> 


Lt 

to 




u> 

to 
o 

1 




vo 
on 

to 


vo 

ON 

a 
t 

i— » 

it 


vo 
On 

CO 
1 

UJ 


VO 
On 
i 




w 

1 


3.4e-06 


Psi 
Blast 






0.44 


0.21 


0.11 


0.02 




-0.70 


-0.68 


Verify 
score 






-0.19 I 


-0.20 


1 

p 

VO 


i -0.19 




0.42 


0.41 


PMF 
score 


61.14 


















SEQFOLD 
score 


APOLIPOPROTEIN A-I; CHAIN: 
A, B, C, D; 




VIRUS TOMATO BUSHY
STUNT VIRUS 2TBV 4


CYCLODEXTRIN 
GLUCANOTRANSFERASE; 
CHAIN: A, B; 


INVASIN; CHAIN: A; 


CYCLODEXTRIN 
GLYCOSYLTRANSFERASE;
1CIU 6 CHAIN: NULL; 1CIU 7




FUMARYLACETOACETATE 
HYDROLASE; CHAIN: A, B; 


FUMARYLACETOACETATE 
HYDROLASE; CHAIN: A, B; 


Compound 


LIPID TRANSPORT APO A-I; 
LIPOPROTEIN, LIPID 
TRANSPORT, CHOLESTEROL 
METABOLISM, 2 
ATHEROSCLEROSIS, HDL, 
LCAT-ACTIVATION 






GLYCOSYLTRANSFERASE 
TRANSFERASE, 
GLYCOSYLTRANSFERASE, 
CALCIUM, SIGNAL 


STRUCTURAL PROTEIN 
INTEGRIN-BINDING PROTEIN, 
INVGENE 


GLYCOSIDASE CGTASE; 1CIU 8
THERMOSTABLE 1CIU 14




HYDROLASE
BETADIKETONASE, FAA;
MIXED BETA-SANDWICH ROLL,
HYDROLASE


HYDROLASE 
BETADIKETONASE, FAA; 
MIXED BETA-SANDWICH ROLL, 
HYDROLASE 
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1102 


1102 


1102 


1102 


1102 


1102 




1klo


1klo


1ezg


& 

OO 


»-» 

o 

N 
00 


n> 
N 

oo 








> 


> 


> 


> 


CHAIN 
ID 


C\ 

oo 


It 
o\ 




CO 
OO 


u> 
to 


to 

00 
VO 


START 
AA 


OJ 
4* 
tO 


u> 
o 


ft 

o 


o 


u> 

VO 

to 


u> 

ON 
OO 


Si 


n 
i 

o 


u> 
to 
n 

b 

00 


4*. 

8? 
6 

OO 


OO 

? 

»— » 

to 


t— » 

4u 
<D 
i 

o 


to 
io 

I 

VO 


Psi 
Blast 


0.20 


0.02 


0.20 


0.70 


1.07 


1.12 


Verify 
score 


cb 
Co 


-0.19 


-0.17 


r 

O 
t— » 


o 
b 
to 


-0.11 


PMF 
score 














SEQFOLD 
score 


LAMININ; CHAIN: NULL; 




T 


l 

J 
J 


THERMAL HYSTERESIS 
PROTEIN ISOFORM YL-1; 
CHAIN: A, B;


THERMAL HYSTERESIS 
PROTEIN ISOFORM YL-1;
CHAIN: A, B; 


THERMAL HYSTERESIS 
PROTEIN ISOFORM YL-1; 
CHAIN: A, B; 


(. 
J 


THERMAL HYSTERESIS 
PROTEIN ISOFORM YL-1; 


Compound 


GLYCOPROTEIN 
GLYCOPROTEIN 


GLYCOPROTEIN 
GLYCOPROTEIN 


ANTIFREEZE PROTEIN INSECT 
ANTIFREEZE PROTEIN, 
THERMAL HYSTERESIS, 
TENEBRIO 2 MOLITOR, 
IODINATION, RIGHT-HANDED
BETA-HELIX, TMAFP 


ANTIFREEZE PROTEIN INSECT 
ANTIFREEZE PROTEIN, 
THERMAL HYSTERESIS, 
TENEBRIO 2 MOLITOR, 
IODINATION, RIGHT-HANDED
BETA-HELIX, TMAFP


ANTIFREEZE PROTEIN INSECT 
ANTIFREEZE PROTEIN, 
THERMAL HYSTERESIS, 
TENEBRIO 2 MOLITOR, 
IODINATION, RIGHT-HANDED
BETA-HELIX, TMAFP


ANTIFREEZE PROTEIN INSECT 
ANTIFREEZE PROTEIN, 
THERMAL HYSTERESIS, 
TENEBRIO 2 MOLITOR, 
IODINATION, RIGHT-HANDED
BETA-HELIX, TMAFP
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1110 
1110 


1110 


1110


1110


1110 


1109 


► 
< 










1ajj


1ajj


1kxi




eg 


> > 


> 


> 






> 




CHAIN 
ID 


51 Jo 


>— » 

vo 
-J 


VO 
Ln 


>— * 

VO 


(— » 

VO 
-4 






START 
AA 


to to 
Ul to 


to 
to 
VO 


to 
O 


to 

UJ 
LO 


to 
to 
VO 






El 


w to 
to vo 

<b £ 

vo O 


n> 
i 

O 
vo 


bo 

? 
o 
vo 


VO 

I 

VO 


9.6e-10 


0.0029 




Psi 
Blast 


o r* 

t— » o 


0.43 


0.15 


0.04 


0.35 


-0.03 




Verify 
score 


0.49 
-0.14 


0.49 


0.01 


-0.03 


0.84 


0.04 




PMF 
score 
















SEQFOLD 
score 


LIPOPROTEIN RECEPTOR 
RELATED PROTEIN; CHAIN: 
A; 

LOW-DENSITY LIPOPROTEIN 


i 

i 


LOW-DENSITY LIPOPROTEIN 
RECEPTOR; CHAIN: A;


LOW DENSITY LIPOPROTEIN 
RECEPTOR RELATED 
PROTEIN; CHAIN: A; 




LOW-DENSITY LIPOPROTEIN 
RECEPTOR; CHAIN: NULL;


LOW-DENSITY LIPOPROTEIN 
RECEPTOR; CHAIN: NULL; 


CARDIOTOXIN V; CHAIN: A, 
B; 


ACETYLCHOLINE 1ABT 4
RECEPTOR (NMR, 4
STRUCTURES) 1ABT 5


Compound 


SIGNALING PROTEIN LIGAND
BINDING, CALCIUM BINDING, 
COMPLEMENT-LIKE REPEAT, 2 
RECEPTOR, SIGNALING 
PROTEIN 

LIPID BINDING PROTEIN LDL 


SIGNALING PROTEIN LKo , 
RECEPTOR, LDLR, CYSTEINE- 
RICH MODULE, CALCIUM 
LIGAND- 2 BINDING, FAMILIAL 
HYPERCHOLESTEROLEMIA 


LIPID BINDING PROTEIN
RECEPTOR, LIGAND BINDING, 
CALCIUM BINDING, LDLR, LRP, 
LIPID 2 BINDING PROTEIN 


RECEPTOR LR5; RECEPTOR,
LDL RECEPTOR, CYSTEINE- 
RICH MODULE, CALCIUM 


RECEPTOR LR5; RECEPTOR, 
LDL RECEPTOR, CYSTEINE-
RICH MODULE, CALCIUM 
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SEQFOLD 
score 


HIV-1 REVERSE 
TRANSCRIPTASE (A-CHAIN); 


HIV-1 REVERSE 
TRANSCRIPTASE (A-CHAIN); 
CHAIN: A; HIV-1 REVERSE 
TRANSCRIPTASE (B-CHAIN); 
CHAIN: B; 


LECTIN (AGGLUTININ) 
WHEAT GERM AGGLUTININ 
(ISOLECTIN 2) 9WGA 3


LECTIN (AGGLUTININ) 
WHEAT GERM AGGLUTININ 
(ISOLECTIN 2) 9WGA 3


LECTIN (AGGLUTININ) 
WHEAT GERM AGGLUTININ 
(ISOLECTIN 2) 9WGA 3


LOW-DENSITY LIPOPROTEIN 
RECEPTOR; 1LDL 4 CHAIN:
NULL; 1LDL 5


LOW-DENSITY LIPOPROTEIN
RECEPTOR; CHAIN: A; 


LOW-DENSITY LIPOPROTEIN 
RECEPTOR; CHAIN: A; 


RECEPTOR; CHAIN: A; 


Compound 


Q £ 

if 

p 


TRANSFERASE HIV-1 REVERSE 
TRANSCRIPTASE, AIDS, NON- 
NUCLEOSIDE INHIBITOR, 2 
DRUG DESIGN 








BE! 

J Q C 

sag 
li 

a- 


RECEPTOR; BETA HAIRPIN, 3-10 
HELIX, CALCIUM BINDING 


RECEPTOR; BETA HAIRPIN, 3-10 
HELIX, CALCIUM BINDING 
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score 


TROPOMYOSIN; CHAIN: A, B, 




NUCLEOTIDYLTRANSFERASE 
REVERSE TRANSCRIPTASE 
(E.C.2.7.7.49) 3HVT 3 


HIV-1 REVERSE
TRANSCRIPTASE; 1VRT 4
CHAIN: A, B; 1VRT 5


HIV-1 REVERSE
TRANSCRIPTASE; 1VRT 4
CHAIN: A, B; 1VRT 5


HIV-1 REVERSE
TRANSCRIPTASE; 1RTH 4
CHAIN: A, B; 1RTH 5


HIV-1 REVERSE
TRANSCRIPTASE; 1RTH 4
CHAIN: A, B; 1RTH 5


TRANSCRIPTASE (AMINO- 
TERMINAL HALF) (FINGERS 
1HAR 3 AND PALM
SUBDOMAINS) (RT216)
(E.C.2.7.7.49) 1HAR 4


Compound 


CONTRACTILE PROTEIN 
TROPOMYOSIN COILED-COIL 
ALPHA-HELICAL, 






NUCLEOTIDYLTRANSFERASE
HIV-1 RT; 1VRT 6 HIV-1
REVERSE TRANSCRIPTASE
1VRT 15


NUCLEOTIDYLTRANSFERASE
HIV-1 RT; 1VRT 6 HIV-1
REVERSE TRANSCRIPTASE
1VRT 15


NUCLEOTIDYLTRANSFERASE
HIV-1 RT; 1RTH 6 HIV-1
REVERSE TRANSCRIPTASE
1RTH 15


NUCLEOTIDYLTRANSFERASE
HIV-1 RT; 1RTH 6 HIV-1
REVERSE TRANSCRIPTASE
1RTH 15
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score 
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5 so 
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B*0801; CHAIN: A; BETA-2 
MICROGLOBULIN; CHAIN: B; 
HIV-1 GAG PEPTIDE 
(GGKKKYKL - INDEX 
PEPTIDE); CHAIN: C; 


PEPTIDE); CHAIN: C; 


B*0801; CHAIN: A; BETA-2 
MICROGLOBULIN; CHAIN: B; 
HIV-1 GAG PEPTIDE


HFE; CHAIN: A, C; BETA-2- 
MICROGLOBULIN; CHAIN: B, 
D 


C 


HLA-DR3; CHAIN: A, B; CLIP; 




Compound 


IMMUNE SYSTEM HLA-DR2,


COMPLEX (MHC 
PROTEIN/ANTIGEN) DRA, DRB1
01010; COMPLEX (MHC 
PROTEIN/ANTIGEN), 
HISTOCOMPATIBILITY 
ANTIGEN 


HISTOCOMPATIBILITY
COMPLEX B8; B2M; PEPTIDE
HLA B8, HIV, MHC CLASS I,
HISTOCOMPATIBILITY
COMPLEX 


HISTOCOMPATIBILITY 
COMPLEX B8; B2M; PEPTIDE 
HLA B8, HIV, MHC CLASS I, 
HISTOCOMPATIBILITY 
COMPLEX 


MHC CLASS I COMPLEX HFE, 
HEREDITARY 

HEMOCHROMATOSIS, MHC 
CLASS I 


COMPLEX 

(TRANSMEMBRANE/GLYCOPRO 
TEIN) MHC GLYCOPROTEIN, 
COMPLEX 

(TRANSMEMBRANE/GLYCOPRO 
TEIN) 


1 

Q 

I 

w 


PDB annotation 






WO 03/080795 






PCT/US02/25485



1193 


1193 


1193 


1193 




3 w 23 


1hoc




1 


1cd1






> 


> 


> 


> 




CHAIN 
ID 


to 

VO 


to 

VO 


to 

VO 


US 




START 
AA 


K> 
4^ 
Ch 


ON 


to 
to 

On. 


LIZ 




El 


OO 

o 

1 

oo 

-o 


I— ■ 
VO 


o\ 

VO 


i 




Psi 
Blast 




0.49 


0.46 


-0.31 




Verify 
score 




1.00 


1.00 


0.11 




PMF 
score 


64.64 










SEQFOLD 
score 


HISTOCOMPATIBILITY
ANTIGEN MURINE CLASS I 
MAJOR 

HISTOCOMPATIBILITY 
COMPLEX CONSISTING 1HOC
3 OF H-2D=B=, B2-


HLA-CW3 (HEAVY CHAIN); 
CHAIN: A; BETA-2- 
MICROGLOBULIN; CHAIN: B; 
PEPTIDE FROM IMPORTIN 
ALPHA-2; CHAIN: C; 
NATURAL KILLER CELL 
RECEPTOR KIR2DL2; CHAIN: 
D,E; 


HLA-A*0201; CHAIN: A, D; 
BETA-2 MICROGLOBULIN; 
CHAIN: B, E; HTLV-1
OCTAMERIC TAX PEPTIDE; 
CHAIN: C, F; 


CD1; CHAIN: A, B, C, D;


DR2; CHAIN: B, E; HLA-DR2; 
CHAIN: C, F; 


Compound 




IMMUNE SYSTEM MHC, HLA, 
CLASS I, KIR, NK CELL 
RECEPTOR, IMMUNOGLOBULIN 
2 FOLD, RECEPTOR/MHC 
COMPLEX 




IMMUNE SYSTEM 
IMMUNOGLOBULIN FOLD 


CD1MCD1D.1;CD1, 
IMMUNOLOGY, MHC, TCR, 
GLYCOPROTEIN, SIGNAL, 2 
IMMUNOGLOBULIN FOLD, T- 
CELL 


MYELIN BASIC PROTEIN, 
MULTIPLE SCLEROSIS, 2 
AUTOIMMUNITY, IMMUNE 
SYSTEM 
PLATELET FACTOR 4 1PLF 3


PF4-M2 CHIMERA; 1PFM 7
CHAIN: A, B, C, D; 1PFM 8


MACROPHAGE 
INFLAMMATORY PROTEIN-2; 
CHAIN: A, B; 


CHEMOKINE(GROWTH 
FACTOR) HUMAN 
MELANOMA GROWTH 
STIMULATING ACTIVITY 
(MGSA/GRO ALPHA) 1MGS 3
ULP1 PROTEASE; CHAIN: A;
UBIQUITIN-LIKE PROTEIN
SMT3; CHAIN: B; 


SUMO-1; CHAIN: NULL; 




NEUTROPHIL ACTIVATING 
PEPTIDE 2 VARIANT; CHAIN: 
A, B, C, D; 


NEUTROPHIL ACTIVATING 
PEPTIDE 2 VARIANT; CHAIN: 
A, B, C,D; 


PLATELET FACTOR 
PLATELET FACTOR 4 (HPF4) 
(HUMAN RECOMBINANT) 
1RHP 3




Compound 


HYDROLASE SUMO 
HYDROLASE, UBIQUITIN-LIKE 
PROTEASE 1, SMT3 
HYDROLASE 2 
DESUMOYLATING ENZYME, 
CYSTEINE PROTEASE, SUMO 
PROCESSING 3 ENZYME, SMT3 
PROCESSING ENZYME, NABH4, 
THIOHEMIACETAL, 4 


TARGETING PROTEIN PIC1, 
GMP1, UBL1, SENTRIN; SUMO-1,
POST-TRANSLATIONAL
PROTEIN MODIFICATION, 2 
UBIQUITIN-LIKE PROTEINS, 
TARGETING PROTEIN 




CYTOKINE NAP-2; CYTOKINE 


CYTOKINE NAP-2; CYTOKINE 




HUMAN CHEMOKINE GROB[5- 
73], CXC CHEMOKINE 
ANTIBODY CTM01; CHAIN: L,


FAB FRAGMENT CTM01;
CHAIN: L, H, A, B; 


IMMUNOGLOBULIN, DIELS 
ALDER CATALYTIC 
ANTIBODY; CHAIN: L, H, A, B; 


IGG2A; CHAIN: L, H; HUMAN 
RHINOVIRUS CAPSID 
PROTEIN VP2; CHAIN: P; 




MHC CLASS I NK CELL 
RECEPTOR PRECURSOR; 
CHAIN: A; 
IMMUNOGLOBULIN


IMMUNOGLOBULIN 1 
IMMUNOGLOBULIN, FAB 
FRAGMENT 


IMMUNOGLOBULIN 
IMMUNOGLOBULIN,
ANTIBODY, CATALYTIC 
ANTIBODY, DIELS ALDER, 2 
GERMLINE 


COMPLEX 

(IMMUNOGLOBULIN/VIRAL
PEPTIDE) ANTIBODY 8F5; 
IMMUNOGLOBULIN, 
ANTIBODY, RHINOVIRUS, 
NEUTRALIZATION, 2 
CONTINUOUS EPITOPE, 
COMPLEX 

(IMMUNOGLOBULIN/VIRAL
PEPTIDE) 




IMMUNE SYSTEM P58 
NATURAL KILLER CELL 
RECEPTOR; KIR, NATURAL 
KILLER RECEPTOR, 
INHIBITORY RECEPTOR, 2 
IMMUNOGLOBULIN 




COVALENT PROTEASE ADDUCT 
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SEQFOLD 
score 


HIV-1 REVERSE


IMMUNOGLOBULIN IGG2A 
FAB FRAGMENT (CNJ206) 
2GFB 3


IMMUNOGLOBULIN FAB 
FRAGMENT OF A 
HUMANIZED VERSION OF 
THE ANTI-CD18 2FGW 3
ANTIBODY 'H52' (HUH52-OZ
FAB) 2FGW 4




PTR1.9 FAB; CHAIN: L, H; 


SM3 ANTIBODY; CHAIN: L, H; 
PEPTIDE EPITOPE; CHAIN: P; 


MONOCLONAL ANTIBODY 
3A2; CHAIN: H, L;


MONOCLONAL ANTIBODY 
3A2; CHAIN: H, L; 


(HEAVY CHAIN); CHAIN: B, D;


Compound 


COMPLEX (RT/DNA/FAB) HIV-1


I 




IMMUNOGLOBULIN TR1.9, 
ANTI-THYROID PEROXIDASE, 
AUTOANTIBODY, 2 
IMMUNOGLOBULIN 


COMPLEX (ANTIBODY/PEPTIDE
EPITOPE) ANTIBODY, PEPTIDE 
ANTIGEN, ANTITUMOR 
ANTIBODY, 2 COMPLEX 
(ANTIBODY/PEPTIDE EPITOPE) 


MONOCLONAL ANTIBODY 
MONOCLONAL ANTIBODY, 
FAB-FRAGMENT, 
REPRODUCTION 


MONOCLONAL ANTIBODY 
MONOCLONAL ANTIBODY, 
FAB-FRAGMENT, 
REPRODUCTION 
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SEQFOLD 
score 


ALPHA-1 SYNTROPHIN


NEURONAL NITRIC OXIDE 
SYNTHASE (RESIDUES 1-13( 
CHAIN: A; 


HUMAN DISCS LARGE 
PROTEIN; CHAIN: NULL; 


HCASK/LIN-2 PROTEIN; 
CHAIN: A, B; 




PSD-95; CHAIN: A; CRIPT; 
CHAIN: B; 


si 

DC 


NEURONAL NITRIC OXIDE 
SYNTHASE; CHAIN: A; 




EBOLA VIRUS ENVELOPE 
PROTEIN CHIMERA 
CONSISTING CHAIN: A, B, C, 
D,E,F; 




PEPSIN; CHAIN: NULL;


Compound 


























| MEMBRANE J 


OXIDOREDUCTASE BETA-
FINGER


SIGNAL TRANSDUCTION HDLG, 
DHR3 DOMAIN; SIGNAL 
TRANSDUCTION, SH3 DOMAIN, 
REPEAT 


KINASE HCASK, GLGF REPEAT, 
DHR; PDZ DOMAIN, NEUREXIN, 
SYNDECAN, RECEPTOR 
CLUSTERING, KINASE 


PEPTIDE RECOGNITION 
PEPTIDE RECOGNITION, 
PROTEIN LOCALIZATION 


OXIDOREDUCTASE PDZ 
DOMAIN, NNOS, NITRIC OXIDE 
SYNTHASE 




VIRUS/VIRAL PROTEIN 
MEMBRANE FUSION SUBUNIT, 
VIRUS/VIRAL PROTEIN 




ASPARTYL PROTEASE ACID 
PROTEINASE; ASPARTYL 
PROTEASE, ACID PROTEINASE, 
HYDROLASE 
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SEQFOLD 
score 


INTEGRIN BETA-4 SUBUNIT; 
CHAIN: A, B;


INTEGRIN BETA-4 SUBUNIT; 
CHAIN: A, B; 


INTEGRIN BETA-4 SUBUNIT; 
CHAIN: A, B; 


INTEGRIN BETA-4 SUBUNIT; 
CHAIN: A, B; 


FIBRONECTIN; CHAIN: NULL; 


FIBRONECTIN; CHAIN: NULL; 






Compound 


STRUCTURAL PROTEIN 
INTEGRIN, HEMIDESMOSOME, 
FIBRONECTIN, CARCINOMA, 


STRUCTURAL PROTEIN 
INTEGRIN, HEMIDESMOSOME, 
FIBRONECTIN, CARCINOMA, 
STRUCTURAL 2 PROTEIN 


STRUCTURAL PROTEIN 
INTEGRIN, HEMIDESMOSOME, 
FIBRONECTIN, CARCINOMA, 
STRUCTURAL 2 PROTEIN 


STRUCTURAL PROTEIN 
INTEGRIN, HEMIDESMOSOME, 
FIBRONECTIN, CARCINOMA, 
STRUCTURAL 2 PROTEIN 


CELL ADHESION PROTEIN CELL 
ADHESION PROTEIN, RGD, 
EXTRACELLULAR MATRIX, 2 
HEPARIN-BINDING, 
GLYCOPROTEIN 


CELL ADHESION PROTEIN CELL 
ADHESION PROTEIN, RGD, 
EXTRACELLULAR MATRIX, 2 
HEPARIN-BINDING, 
GLYCOPROTEIN 


EXTRACELLULAR MATRIX, 2 

HEPARIN-BINDING, 

GLYCOPROTEIN 
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FIBRONECTIN; CHAIN: A; 


TENASCIN; CHAIN: A, B; 


TENASCIN; CHAIN: A, B; 


INTEGRIN BETA-4 SUBUNIT; 
CHAIN: A, B; 


INTEGRIN BETA-4 SUBUNIT; 
CHAIN: A, B; 




Compound 


LYASE ALPHA/BETA FOLD, 
LYASE 




PROTEIN BINDING ED-B, 
FIBRONECTIN, TYPEIII 
DOMAIN, ANGIOGENESIS, 
PROTEIN 2 BINDING 


STRUCTURAL PROTEIN 
TENASCIN, FIBRONECTIN 
TYPE-III, HEPARIN, 
EXTRACELLULAR 2 MATRIX, 
ADHESION, FUSION PROTEIN, 
STRUCTURAL PROTEIN 


STRUCTURAL PROTEIN 
TENASCIN, FIBRONECTIN 
TYPE-III, HEPARIN,
EXTRACELLULAR 2 MATRIX, 
ADHESION, FUSION PROTEIN, 
STRUCTURAL PROTEIN 


STRUCTURAL PROTEIN 
INTEGRIN, HEMIDESMOSOME, 
FIBRONECTIN, CARCINOMA, 
STRUCTURAL 2 PROTEIN 


STRUCTURAL PROTEIN 
INTEGRIN, HEMIDESMOSOME, 
FIBRONECTIN, CARCINOMA, 
STRUCTURAL 2 PROTEIN 


STRUCTURAL 2 PROTEIN
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50.38 






68.76 










SEQFOLD 
score 


BLOOD COAGULATION 
INHIBITOR ECHISTATIN 
(NMR, 8 STRUCTURES) 2ECH 3 


BLOOD COAGULATION 
INHIBITOR ECHISTATIN 
(NMR, 8 STRUCTURES) 2ECH 3 


BLOOD COAGULATION 
FACTOR XA; CHAIN: L, C; 


AGGREGATION INHIBITOR, 
GP ANTAGONIST KISTRIN 
(NMR, 8 STRUCTURES) 1KST 3


AGGREGATION INHIBITOR, 
GP ANTAGONIST KISTRIN 
(NMR, 8 STRUCTURES) 1KST 3


AGGREGATION INHIBITOR, 
GP ANTAGONIST KISTRIN 
(NMR, 8 STRUCTURES) 1KST 3


LAMININ; CHAIN: NULL; 


FLAVORIDIN; 1FVL 4 CHAIN:
NULL 1FVL 5


NULL 1FVL 5


Compound 






BLOOD COAGULATION 
FACTOR STUART FACTOR; 
BLOOD COAGULATION 
FACTOR, SERINE PROTEINASE, 
EPIDERMAL 2 GROWTH 
FACTOR LIKE DOMAIN 








GLYCOPROTEIN 
GLYCOPROTEIN 


BLOOD COAGULATION 
INHIBITOR GP IIB/IIIA
ANTAGONIST 1FVL 9


INHIBITOR GP IIB/IIIA
ANTAGONIST 1FVL 9
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P58-CL42 KIR; CHAIN: NULL; 
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LOW AFFINITY 
IMMUNOGLOBULIN GAMMA 




o o 
5 o 

g * 

> 


HIGH AFFINITY 
IMMUNOGLOBULIN EPSILON 
RECEPTOR CHAIN: A; 




SS07D; CHAIN: A; DNA; 
CHAIN: B, C; 




ADENOSINE KINASE; CHAIN: 
A; 




Compound 


INHIBITORY RECEPTOR KILLER 
CELL INHIBITORY RECEPTOR; 


IMMUNE SYSTEM RECEPTOR 
BETA SANDWICH, 
IMMUNOGLOBULIN-LIKE, 
RECEPTOR 


IMMUNE SYSTEM, MEMBRANE 
PROTEIN CD32; FC RECEPTOR,
IMMUNOGLOBULIN,
LEUKOCYTE, CD32 


IMMUNE SYSTEM FC-EPSILON 
RI-ALPHA; IMMUNOGLOBULIN 
FOLD, GLYCOPROTEIN, 
RECEPTOR, IGE-BINDING 2 
PROTEIN 




HYPERTHERMOPHILE, 
ACHAEABACTERIA, 2 
COMPLEX (DNA-BINDING
PROTEIN/DNA) 


COMPLEX (DNA-BINDING 
PROTEIN/DNA) DNA BINDING 
PROTEIN, 




TRANSFERASE TOXOPLASMA 
GONDII, ADENOSINE KINASE, 
PURINE METABOLISM 
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SEQFOLD 
score 


REGULATOR OF
CHROMOSOME
CONDENSATION 1; CHAIN: A,




POL POLYPROTEIN; CHAIN: A;


POL POLYPROTEIN; CHAIN: A,
B;


DEOXYURIDINE 5'-
TRIPHOSPHATE
NUCLEOTIDOHYDROLASE;
CHAIN: A;


DEOXYURIDINE 5'-
TRIPHOSPHATE
NUCLEODITOHYDROLASE;
CHAIN: NULL;


DEOXYURIDINE 5'-
TRIPHOSPHATE
NUCLEODITOHYDROLASE;
CHAIN: NULL;




DEHYDROGENASE (APO
FORM) (E.C.1.1.1.29) 1GDH 3


Compound


GUANINE NUCLEOTIDE 
EXCHANGE FACTOR RCC1;
GUANINE NUCLEOTIDE 
EXCHANGE FACTOR, GEF, RAN, 
2 RAS-LIKE NUCLEAR GTP 




VIRUS/VIRAL PROTEIN EIGHT 
STRANDED BETA BARREL 
PROTEIN 


VIRUS/VIRAL PROTEIN EIGHT 
STRANDED BETA-BARREL 


HYDROLASE DUTPASE; JELLY 
ROLL, MERCURY DERIVATIVE 


HYDROLASE DUTPASE, DUTP 
PYROPHOSPHATASE; 
HYDROLASE, DUTPASE, EIAV, 
TRIMERIC ENZYME, ASPARTYL
PROTEASE 


HYDROLASE DUTPASE, DUTP 
PYROPHOSPHATASE; 
HYDROLASE, DUTPASE, EIAV, 
TRIMERIC ENZYME, ASPARTYL
PROTEASE 
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SEQFOLD 
score 


TPR2A-DOMAIN OF HOP;
CHAIN: A; HSP90-PEPTIDE 


SYNTAXIN BINDING PROTEIN 
1; CHAIN: A; SYNTAXIN 1A; 
CHAIN: B; 


ALPHA SPECTRIN; CHAIN: A,
B, C;


APOLIPOPROTEIN A-I; CHAIN: 
A, B, C, D; 


PROTEIN KINASE C DELTA 
TYPE; 1PTQ 4


RAF-1; CHAIN: NULL; 


NUCLEOTIDYLTRANSFERASE 
; CHAIN: A, B;


Compound 


o n < 

0 j£ j 

1 8 t 


? m n 

2 ^ ? c 

its* 

3°9 

2 


STRUCTURAL PROTEIN TWO 
REPEATS OF SPECTRIN, ALPHA 
HELICAL LINKER REGION, 2 2 
TANDEM 3-HELIX COILED- 


LIPID TRANSPORT APO A-I; 

LIPOPROTEIN, LIPID 

TRANSPORT, CHOLESTEROL 

METABOLISM, 2 

ATHEROSCLEROSIS, HDL, 
LCAT-ACTIVATION


PHOSPHOTRANSFERASE 


«P > £ 
0 ^ 00 2 

2P§ 
P 9 a 

§§§ 

ill 


SERINE/THREONINE PROTEIN 
KINASE TRANSFERASE, 


ANTIBIOTIC RESISTANCE, 
TRANSFERASE, PLASMID 
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SEQFOLD 
score 


PHOSPHORYLASE KINASE; 
CHAIN: NULL; 


TWITCHIN; CHAIN: A, B; 


TWITCHIN; CHAIN: NULL; 


SERINE/THREONINE-PROTEIN 
KINASE PAK-ALPHA; CHAIN: 
A, B; SERINE/THREONINE- 
PROTEIN KINASE PAK- 
ALPHA; CHAIN: C, D; 


TRANSFERASE(PHOSPHOTRA 
NSFERASE) CAMP- 
DEPENDENT PROTEIN 
KINASE (E.C.2.7.1.37) (CAPK) 
1CTP 3 (CATALYTIC
SUBUNIT) 1CTP 4


PHOSPHOTRANSFERASE 
CAMP-DEPENDENT PROTEIN 
KINASE CATALYTIC SUBUNIT 
1CMK 3 (E.C.2.7.1.37) 1CMK 4


(/S139AS) COMPLEX WITH 
THE PEPTIDE 1APM 5
INHIBITOR PKI(5-24) AND THE
DETERGENT MEGA-8 1APM 6


Compound 


KINASE RABBIT MUSCLE 
PHOSPHORYLASE KINASE; 
GLYCOGEN METABOLISM, 
TRANSFERASE, 

SERINE/THREONINE-PROTEIN, 2 


KINASE KINASE, TWITCHIN, 
INTRASTERIC REGULATION 


KINASE KINASE, TWITCHIN, 
INTRASTERIC REGULATION 


TRANSFERASE KINASE 
DOMAIN, AUTOINHIBITORY
FRAGMENT, HOMODIMER 
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COAGULATION, PLASMA, 
PLATELET, FIBRINOGEN, 


BLOOD COAGULATION BLOOD 
COAGULATION, FIBRINOGEN- 
420, ALPHAEC DOMAIN, 2 
FIBRINOGEN RELATED 
DOMAIN, GLYCOSYLATED 
PROTEIN 


BLOOD COAGULATION BLOOD 
COAGULATION, PLASMA 
PROTEIN, CROSSLINKING 


BLOOD COAGULATION BLOOD 
COAGULATION, PLASMA 
PROTEIN, CROSSLINKING 


BLOOD COAGULATION BLOOD 
COAGULATION, PLASMA 
PROTEIN, CROSSLINKING 


BLOOD COAGULATION BLOOD 
COAGULATION, PLASMA 
PROTEIN, CROSSLINKING 


FACTOR BLOOD 
COAGULATION, 
GLYCOPROTEIN, CALCIUM, 
PLATELET, PLASMA, 2 
ALTERNATIVE SPLICING, 
SIGNAL, DISEASE MUTATION, 3 
POLYMORPHISM 
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Verify 
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0.55 
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0.47 




PMF 
score 












SEQFOLD 
score 


FIBROBLAST GROWTH 
FACTOR 1; CHAIN: A, B; 


FIBROBLAST GROWTH 
FACTOR 1; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; CHAIN: 
C,D; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C, D; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; CHAIN: 
E, F, G, H; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C, D; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; CHAIN: 
E, F, G, H; 


FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; CHAIN: 
E, F, G, H; 


Compound 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF1;


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF1;
FGFR1; IMMUNOGLOBULIN (IG)
LIKE DOMAINS BELONGING TO 
THE I-SET 2 SUBGROUP WITHIN 
IG-LIKE DOMAINS, B-TREFOIL 
FOLD 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF2; 
FGFR2; IMMUNOGLOBULIN 
(IG)LIKE DOMAINS BELONGING 
TO THE I-SET 2 SUBGROUP 
WITHIN IG-LIKE DOMAINS, B- 
TREFOIL FOLD 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF2; 
FGFR2; IMMUNOGLOBULIN 
(IG)LIKE DOMAINS BELONGING 
TO THE I-SET 2 SUBGROUP 
WITHIN IG-LIKE DOMAINS, B- 
TREFOIL FOLD 


FGFR2; IMMUNOGLOBULIN 
(IG)LIKE DOMAINS BELONGING 
TO THE I-SET 2 SUBGROUP 
WITHIN IG-LIKE DOMAINS, B- 
TREFOIL FOLD 
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SEQFOLD 
score 


P53; CHAIN: A; 53BP2; CHAIN: 
B; 


PHOSPHOTRANSFERASE FYN 
PROTO-ONCOGENE 
TYROSINE KINASE 
(E.C.2.7.1.112) 1SHF 3 (SH3
DOMAIN) 1SHF 4


SEM-5; 1SEM 3 CHAIN: A, B;
1SEM 5 10-RESIDUE PROLINE-
RICH PEPTIDE FROM MSOS
1SEM 8 CHAIN: C, D 1SEM 10


ALPHA SPECTRIN; CHAIN: 
NULL; 


PHOSPHORIC DIESTER 
HYDROLASE 

PHOSPHOLIPASE C-GAMMA 
(SH3 DOMAIN) (E.C.3.1.4.11)
1HSQ 3 (NMR, MINIMIZED
MEAN STRUCTURE) 1HSQ 4


GROWTH FACTOR BOUND
PROTEIN 2; 1GRI 5 CHAIN: A,
B; 1GRI 6


Compound 


COMPLEX (ANTI- 
ONCOGENE/ANKYRIN 
REPEATS) P53BP2; ANKYRIN 
REPEATS, SH3, P53, TUMOR 


< 

j 

> 
c 

i 

h 
V 


SIGNAL TRANSDUCTION 
PROTEIN SRC-HOMOLOGY 3 
(SH3) DOMAIN, PEPTIDE- 
BINDING PROTEIN, 1SEM 18 2
GUANINE NUCLEOTIDE
EXCHANGE FACTOR 1SEM 19


CIRCULAR PERMUTANT PWT; 
CIRCULAR PERMUTANT, SH3 
DOMAIN, CYTOSKELETON




SIGNAL TRANSDUCTION 
ADAPTOR SH2, SH3 1GRI 14
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SEQFOLD 
score 


NF-KAPPA-B P65 SUBUNIT;
CHAIN: A; NF-KAPPA-B 




CYCLIN-DEPENDENT KINASE
6 INHIBITOR; CHAIN: A,


CYCLIN-DEPENDENT KINASE
6 INHIBITOR; CHAIN: A,


CYCLIN-DEPENDENT KINASE
6 INHIBITOR; CHAIN: A,


CYCLIN-DEPENDENT KINASE
6 INHIBITOR; CHAIN: A,




PYK2-ASSOCIATED PROTEIN
BETA; CHAIN: A; 


4 INHIBITOR B; CHAIN: i 




Compound 


P50D 




w 


CO 

W 


cd g 
- > 


dd 2 
" > 

CO 

w 






j? 






TRANSCRIPTION FACTOR P65; 
P50D; TRANSCRIPTION 


INHIBITOR


CELL CYCLE INHIBITOR P18- 
INK4C(INK6); CELL CYCLE 
INHIBITOR, P18-INK4C(INK6), 
ANKYRIN REPEAT, 2 CDK 4/6 


CELL CYCLE INHIBITOR P18-
INK4C(INK6); CELL CYCLE 
INHIBITOR, P18-INK4C(INK6), 
ANKYRIN REPEAT, 2 CDK 4/6 
INHIBITOR 


CELL CYCLE INHIBITOR P18- 
INK4C(INK6); CELL CYCLE 
INHIBITOR, P18-INK4C(INK6), 
ANKYRIN REPEAT, 2 CDK 4/6 
INHIBITOR 


CELL CYCLE INHIBITOR P18- 
INK4C(INK6); CELL CYCLE 
INHIBITOR, P18-INK4C(INK6), 
ANKYRIN REPEAT, 2 CDK 4/6 
INHIBITOR 


METAL BINDING PROTEIN 
ZINC-BINDING MODULE, 
ANKYRIN REPEATS, METAL 
BINDING PROTEIN 


TURN-HELIX, ANKYRIN 
REPEAT 
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SEQFOLD 
score 


COXSACKIE VIRUS AND




RAB-3A; CHAIN: A; 
RABPHILIN-3A; CHAIN: B; 




P53; CHAIN: A; 53BP2; CHAIN: 
B; 




Compound 


VIRUS/VIRAL PROTEIN




COMPLEX (GTP- 
BINDING/EFFECTOR) RAS- 
RELATED PROTEIN RAB3A; 
COMPLEX (GTP- 
BINDING/EFFECTOR), G 
PROTEIN, EFFECTOR, RABCDR, 
2 SYNAPTIC EXOCYTOSIS, RAB 
PROTEIN, RAB3A, RABPHILIN 




COMPLEX (ANTI- 
ONCOGENE/ANKYRIN 
REPEATS) P53BP2; ANKYRIN 
REPEATS, SH3, P53, TUMOR 
SUPPRESSOR, MULTIGENE 2 
FAMILY, NUCLEAR PROTEIN, 
PHOSPHORYLATION, DISEASE 
MUTATION, 3 POLYMORPHISM, 
COMPLEX (ANTI- 
ONCOGENE/ANKYRIN 
REPEATS) 


FAMILY, NUCLEAR PROTEIN, 
PHOSPHORYLATION, DISEASE 
MUTATION, 3 POLYMORPHISM, 
COMPLEX (ANTI-
ONCOGENE/ANKYRIN
REPEATS) 
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SEQFOLD 
score 


SOS 1; CHAIN: NULL; 


GRP1; CHAIN: A;


RHO-GEF VAV; CHAIN: A; 




HUMAN SOS 1; CHAIN: A;


HUMAN SOS 1; CHAIN: A; 




PDC; CHAIN: A;


SOS 1; CHAIN: NULL; 






ALPHA, BETA T-CELL 
RECEPTOR CHAIN: A, B;


RECEPTOR BETA CHAIN;
CHAIN: E; 


Compound 


SIGNAL TRANSDUCTION SON 


SIGNALING PROTEIN ARF1
GUANINE NUCLEOTIDE 
EXCHANGE FACTOR AND PH 
DOMAIN 


SIGNALING PROTEIN 11 ALPHA- 
HELICES 


GENE REGULATION SON OF 
SEVENLESS PROTEIN; GUANINE 
NUCLEOTIDE EXCHANGE 
FACTOR, GENE REGULATION 


GENE REGULATION SON OF 
SEVENLESS PROTEIN; GUANINE 
NUCLEOTIDE EXCHANGE 
FACTOR, GENE REGULATION 


TRANSPORT PROTEIN RHO- 
GTPASE EXCHANGE FACTOR, 
TRANSPORT PROTEIN 


SIGNAL TRANSDUCTION 
SIGNAL TRANSDUCTION, SOS, 
PLECKSTRIN HOMOLOGY (PH) 
DOMAIN 




RECEPTOR TCR; T-CELL, 
RECEPTOR, TRANSMEMBRANE, 
GLYCOPROTEIN, SIGNAL 
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SEQFOLD 
score 


HYDROLASE(ENDORIBONUC 
LEASE) RIBONUCLEASE H 


HYDROLASE(ENDORIBONUC 
LEASE) RIBONUCLEASE H 
(E.C.3.1.26.4) 1RIL3 


RIBONUCLEASE HI; CHAIN: A; 


HIV-1 REVERSE 
TRANSCRIPTASE (CHAIN A); 
CHAIN: A; HIV-1 REVERSE 
TRANSCRIPTASE (CHAIN B); 
CHAIN: B; ANTIBODY (LIGHT 
CHAIN); CHAIN: L; ANTIBODY 
(HEAVY CHAIN); CHAIN: H; 
DNA (5'- CHAIN: T; DNA (5'-
CHAIN: P; 


HIV-1 REVERSE 
TRANSCRIPTASE (A-CHAIN); 
CHAIN: A; HIV-1 REVERSE 
TRANSCRIPTASE (B-CHAIN); 
CHAIN: B; 


INTEGRASE; CHAIN: A; 


Compound 

B,D: 






HYDROLASE RNASE H, 
NUCLEASE, RNASEH*, 
RIBONUCLEASE H, METAL-
BINDING 2 PROTEIN, PROTEIN
FOLDING 


TRANSFERASE/IMMUNE 
SYSTEM/DNA HIV-1 RT; HIV-1 
RT; HIV, REVERSE 
TRANSCRIPTASE, MET184ILE, 
3TC, PROTEIN-DNA 2 COMPLEX, 
DRUG RESISTANCE, M184I,
TRANSFERASE/IMMUNE 3
SYSTEM/DNA


TRANSFERASE HIV-1 REVERSE 
TRANSCRIPTASE, AIDS, NON- 
NUCLEOSIDE INHIBITOR, 2 
DRUG DESIGN 


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SEQFOLD 
score 




ALPHA-1 SYNTROPHIN 
(RESIDUES 77-171); CHAIN: A; 
NEURONAL NITRIC OXIDE 


NEURONAL NITRIC OXIDE 
SYNTHASE (RESIDUES 1-130); 
CHAIN: A; 


HCASK/LIN-2 PROTEIN; 
CHAIN: A, B; 


PSD-95; CHAIN: A; CRIPT; 
CHAIN: B; 


NEURONAL NITRIC OXIDE 
SYNTHASE; CHAIN: A; 
HEPTAPEPTIDE; CHAIN: B;


TYROSINE PHOSPHATASE 
(PTP-BAS, TYPE 1); CHAIN: A; 

... 


ALPHA-1 SYNTROPHIN 
(RESIDUES 77-171); CHAIN: A; 
NEURONAL NITRIC OXIDE 
SYNTHASE (RESIDUES 1-130); 
CHAIN: B;




Compound 




MEMBRANE 

PROTEIN/OXIDOREDUCTASE 
BETA-FINGER, HETERODIMER


OXIDOREDUCTASE BETA- 
FINGER 


KINASE HCASK, GLGF REPEAT, 
DHR; PDZ DOMAIN, NEUREXIN, 
SYNDECAN, RECEPTOR 


PEPTIDE RECOGNITION 
PEPTIDE RECOGNITION, 

PI? HTPTXr T AP ATT7A TTrvxT 


OXIDOREDUCTASE PDZ 
DOMAIN, NNOS, NITRIC OXIDE 


HYDROLASE PDZ DOMAIN, 
HUMAN PHOSPHATASE, 
HPTP1E, PTP-BAS, SPECIFICITY
2 OF BINDING 


MEMBRANE 

PROTEIN/OXIDOREDUCTASE 
BETA-FINGER, HETERODIMER 


TRANSDUCTION, SIB DOMAIN, 
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57.76 


55.30 






SEQFOLD 
score 


GELATINASE A; CHAIN: A; 




UBIQUITIN CONJUGATING 
ENZYME; CHAIN: NULL; 


UBIQUITIN CONJUGATING 
ENZYME; CHAIN: NULL; 


UBIQUITIN CONJUGATING 
ENZYME; CHAIN: NULL; 


UBIQUITIN CONJUGATING 
ENZYME; CHAIN: NULL; 


UBIQUITIN CONJUGATING 
ENZYME; CHAIN: NULL; 


UBC9; CHAIN: NULL; 


Compound 


HYDROLASE MMP-2, 72KD TYPE
IV COLLAGENASE; 
HYDROLASE 




UBIQUITIN CONJUGATION 
UBC7; UBIQUITIN 
CONJUGATION, LIGASE, YEAST


UBIQUITIN CONJUGATION 
UBIQUITIN CONJUGATION, 
UBIQUITIN CARRIER PROTEIN, 
THIOESTER 2 BOND, LIGASE


UBIQUITIN CONJUGATION 
UBIQUITIN CONJUGATION, 
UBIQUITIN CARRIER PROTEIN, 
THIOESTER 2 BOND, LIGASE


UBIQUITIN CONJUGATION 
UBC1; UBIQUITIN
CONJUGATION, LIGASE


UBIQUITIN CONJUGATION 
UBC1; UBIQUITIN 
CONJUGATION, LIGASE


UBIQUITIN-CONJUGATING
ENZYME UBIQUITIN- 
CONJUGATING ENZYME; 
UBIQUITIN-CONJUGATING 
ENZYME, UBIQUITIN-DIRECTED
2 PROTEOLYSIS; CELL CYCLE 
CONTROL, LIGASE 
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PMF 
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127.69 












SEQFOLD 
score 


CYCLOPHILIN A; CHAIN: A; 
PEPTIDE FROM THE HIV-1 
CAPSID PROTEIN; CHAIN: B; 


BACTERIOPHAGE P1 CRE
GENE; CHAIN: A, B; DNA (35- 
MER); CHAIN: C, D; 


CRE RECOMBINASE; CHAIN: 
A, B; DNA (35 NUCLEOTIDE 
CRE RECOGNITION SITE);
CHAIN: C,D; 


CRE RECOMBINASE; CHAIN: 
A, B; DNA; CHAIN: C,D; 


HP1 INTEGRASE; CHAIN: A, B,
C,D; 




Compound 


COMPLEX 

(ISOMERASE/PEPTIDE) 
COMPLEX 

(ISOMERASE/PEPTIDE), 
CYCLOPHILIN A, HIV-1 CAPSID. 


PROTEIN/DNA CRE 
RECOMBINASE, DNA BENDING, 
SITE SPECIFIC 

RECOMBINATION, 2 PROTEIN- 
DNA INTERACTION, 
PROTEIN/DNA 


PROTEIN/DNA CRE 
RECOMBINASE, DNA BENDING, 
RECOMBINATION, PROTEIN- 
DNA 2 INTERACTION, 

PROTEIN/DNA


COMPLEX 

(RECOMBINASE/DNA) CRE-HJ2; 
CRE RECOMBINASE, HOLLIDAY 
JUNCTION, RECOMBINATION, 2 
COMPLEX 


DNA INTEGRATION DNA 
INTEGRATION, 

R VCfWAHTKl A TTTYVT 


SITE-SPECIFIC 

ft P POA/TD TXT A TtrtXT 
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97.55 


99.73 


SEQFOLD 
score 


METHYLMALONYL-COA 
MUTASE; CHAIN: A, B, C, D; 




METHYLMALONYL-COA 
MUTASE; CHAIN: A, B, C, D;


METHYLMALONYL-COA 
MUTASE; CHAIN: A, B, C, D; 


HUMAN SKELETAL MUSCLE 
ALPHA-ACTININ 2; CHAIN: A; 


SYNTAXIN-IA; CHAIN: A, B, 
C; 


COMPLEX 

(ISOMERASE/IMMUNOSUPPR 
ESSANT) CYCLOPHILIN C 
COMPLEXED WITH 
CYCLOSPORIN A 2RMC 3 


CYCLOPHILIN B; 1CYN 6
CHAIN: A; 1CYN 7 [D-
(CHOLINYL)ALA]8-
CYCLOSPORIN; 1CYN 10
CHAIN: C; 1CYN 11


Compound 


ISOMERASE ISOMERASE, 
MUTASE, INTRAMOLECULAR 
TRANSFERASE 


ISOMERASE ISOMERASE, 
MUTASE, INTRAMOLECULAR 
TRANSFERASE 


ISOMERASE ISOMERASE,
MUTASE, INTRAMOLECULAR 


CONTRACTILE PROTEIN 
TRIPLE-HELIX COILED COIL, 
CONTRACTILE PROTEIN


ENDOCYTOSIS/EXOCYTOSIS 
SYNAPTOTAGMIN ASSOCIATED 
35 KDA PROTEIN, P35A, THREE 




2 PSEUDO-SYMMETRY 
COMPLEX 

(ISOMERASE/IMMUNOSUPPRES 
SANT) CYCLOSPORIN, 
ISOMERASE, ROTAMASE, 
SIGNAL irVM iq 
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SEQFOLD 
score 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; CHAIN: 
C,D;


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B; 
FIBROBLAST GROWTH 

FACTOR RECEPTOR 1; CHAIN: 
C,D; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; CHAIN: 
C,D; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; CHAIN: 
C,D; 


Compound 

AXONIN-1: CHAIN: A: 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF, FGFR, 
IMMUNOGLOBULIN-LIKE, 
SIGNAL TRANSDUCTION, 2 
DIMERIZATION. GROWTH 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF, FGFR, 
IMMUNOGLOBULIN-LIKE, 
SIGNAL TRANSDUCTION, 2 
DIMERIZATION, GROWTH 
FACTOR/GROWTH FACTOR 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF, FGFR, 
IMMUNOGLOBULIN-LIKE, 
SIGNAL TRANSDUCTION, 2 
DIMERIZATION, GROWTH 
FACTOR/GROWTH FACTOR 


ADHESION 

GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF, FGFR, 
IMMUNOGLOBULIN-LIKE, 
SIGNAL TRANSDUCTION, 2 
DIMERIZATION, GROWTH 
FACTOR/GROWTH FACTOR 
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SEQFOLD 
score 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C, D; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; CHAIN: 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C, D; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; CHAIN: 
E, F, G, H; 


NEURAL CELL ADHESION 
MOLECULE; CHAIN: A, B, C, 
D; 


CHIMERIC GERMLINE 
PRECURSOR OF OXY-COPE 
CHAIN: L; CHIMERIC 
GERMLINE PRECURSOR OF 
OXY-COPE CHAIN: H: 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; CHAIN: 
C, D; 




Compound 




GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF2; 
FGFR2; IMMUNOGLOBULIN 
(IG)LIKE DOMAINS BELONGING
TO THE I-SET 2 SUBGROUP
WITHIN IG-LIKE DOMAINS, B-
TREFOIL FOLD


CELL ADHESION NCAM; NCAM, 
IMMUNOGLOBULIN FOLD, 
gt .vmpp OTPTM 


IMMUNE SYSTEM IMMUNE 
SYSTEM 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF, FGFR, 
IMMUNOGLOBULIN-LIKE, 
SIGNAL TRANSDUCTION, 2 
DIMERIZATION, GROWTH 
FACTOR/GROWTH FACTOR 

RECEPTOR


FACTOR/GROWTH FACTOR 
RECEPTOR 
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53.60 






SEQFOLD 
score 


NERVE GROWTH FACTOR; 
CHAIN: V, W; TRKA 


TWITCHIN 18TH IGSF 
MODULE; CHAIN: NULL; 


MUSCLE PROTEIN TITIN
MODULE M5 (CONNECTIN)
1TNM 3 (NMR, MINIMIZED
AVERAGE STRUCTURE) 1TNM
4 1TNM 58


THROMBIN; CHAIN: L, H, J, K; 
RHODNIIN; CHAIN: R, S;


TITIN; CHAIN: NULL; 


IMMUNOGLOBULIN 
HETEROLOGOUS LIGHT 
CHAIN DIMER 1MCW 3
(/MCG$/WEIR$ HYBRID)
1MCW 4


Compound 


NERVE GROWTH 
FACTOR/TRKA COMPLEX


MUSCLE PROTEIN 
IMMUNOGLOBULIN 
SUPERFAMILY, I SET, MUSCLE




COMPLEX (SERINE 
PROTEASE/INHIBITOR) 
COMPLEX (SERINE 
PROTEASE/INHIBITOR), KAZAL- 


MUSCLE PROTEIN CONNECTIN, 
NEXTM5; CELL ADHESION, 
GLYCOPROTEIN, 
TRANSMEMBRANE, REPEAT, 
BRAIN, 2 IMMUNOGLOBULIN 
FOLD, ALTERNATIVE SPLICING, 

SIGNAL, 3 MUSCLE PROTEIN
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SEQFOLD 
score 


CALMODULIN; CHAIN: NULL; 
CARDIAC N-TROPONIN C: 




TROPONIN C; CHAIN: NULL;


PARVALBUMIN; CHAIN: A, B 


LECTIN (AGGLUTININ) 
WHEAT GERM AGGLUTININ 
(ISOLECTIN 2) 9WGA 3


IMMUNOGLOBULIN 
IMMUNOGLOBULIN LAMBDA 
LIGHT CHAIN DIMER (/MCG$) 
2MCG 3 (TRIGONAL FORM) 
2MCG 4


RECEPTOR; CHAIN: X, Y; 


Compound 


CALCIUM-BINDING PROTEIN 
CALMODULIN CERIUM TR1C- 
DOMAIN, RESIDUES 1 - 75; 
CERIUM-LOADED, CALCIUM- 
BINDING PROTEIN 
CALCIUM-BINDING CNTNC: 


MUSCLE PROTEIN CTNC; 
CARDIAC, MUSCLE PROTEIN, 
REGULATORY, CALCIUM 


CALCIUM BINDING PROTEIN 
CALCIUM BINDING PROTEIN, 
MUSCLE PROTEIN






BETA-NGF; COMPLEX, TRKA 
RECEPTOR, NERVE GROWTH 
FACTOR, CYSTEINE KNOT, 2 
IMMUNOGLOBULIN LIKE 
DOMAIN, NERVE GROWTH 
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63.96 










SEQFOLD 
score 


CALMODULIN; CHAIN: A; 
RS20; CHAIN: B; 


CALCIUM-BINDING PROTEIN 
RAT ONCOMODULIN i Rum 


PHOSPHOLIPASE C DELTA-1;
CHAIN: NULL; 




PHOSPHOLIPASE C DELTA-1;
CHAIN: NULL;


CALCIUM-BINDING PROTEIN 
NCS-1; CHAIN: A; 


CALMODULIN; CHAIN: A; 




Compound 


CALMODULIN, CALCIUM 
BINDING, HELIX-LOOP-HELIX 
SIGNALLING, 2 

COMPLEX/CALCIUM-BINDING 




SIGNAL TRANSDUCTION 
PROTEIN PLECKSTRIN, 
PHOSPHOLIPASE, INOSITOL 
TRISPHOSPHATE, 2 SIGNAL 
TRANSDUCTION PROTEIN, 

HYDROLASE


SIGNAL TRANSDUCTION 
PROTEIN PLECKSTRIN, 
PHOSPHOLIPASE, INOSITOL 
TRISPHOSPHATE, 2 SIGNAL 
TRANSDUCTION PROTEIN, 

HYDROLASE


METAL BINDING PROTEIN 
YEAST FREQUENIN EF-HAND, 




HAND CALCIUM-BINDING 
PROTEIN, PROTEIN- 2 
COELENTERAZINE PEROXIDE
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SEQFOLD 
score 


INTEGRASE; CHAIN: A, B; 


RSV INTEGRASE; CHAIN: A, B; 


INTEGRASE; CHAIN: A, B, C, 
D; 


INTEGRASE; CHAIN: A, B, C, 
D; 


INTEGRASE; CHAIN: A, B, C; 


INTEGRASE; CHAIN: A; 


INTEGRASE; CHAIN: A; 


Compound 


VIRUS/VIRAL PROTEIN SH3- 
LIKE DOMAIN, NONSPECIFIC 


VIRUS/VIRAL PROTEIN 
INTEGRASE, ROUS SARCOMA 
VIRUS, HIV, X-RAY 
CRYSTALLOGRAPHY, 2 
VIRUS/VIRAL PROTEIN 


TRANSFERASE INTEGRASE, 
ROUS SARCOMA VIRUS, HIV, X- 
RAY CRYSTALLOGRAPHY, 2 
PROTEIN STRUCTURE, 
TRANSFERASE 


TRANSFERASE INTEGRASE, 
ROUS SARCOMA VIRUS, HIV, X- 
RAY CRYSTALLOGRAPHY, 2 
PROTEIN STRUCTURE, 
TRANSFERASE 


DNA INTEGRATION DNA 
INTEGRATION, AIDS, 
POLYPROTEIN, HYDROLASE, 2 
ENDONUCLEASE, 
POLYNUCLEOTIDYL 
TRANSFERASE, DNA BINDING 3 
(VIRAL) 


TRANSFERASE DNA
INTEGRATION, TRANSFERASE


TRANSFERASE DNA 
INTEGRATION 
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SEQFOLD 1 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


QGSR ZINC FINGER PEPTIDE; 
CHAIN: A; DUPLEX 
OLIGONUCLEOTIDE BINDING 
SITE; CHAIN: B, C; 


QGSR ZINC FINGER PEPTIDE; 
CHAIN: A; DUPLEX 
OLIGONUCLEOTIDE BINDING 
SITE; CHAIN: B, C; 


QGSR ZINC FINGER PEPTIDE; 
CHAIN: A; DUPLEX 
OLIGONUCLEOTIDE BINDING 
SITE; CHAIN: B, C;


QGSR ZINC FINGER PEPTIDE; 
CHAIN: A; DUPLEX 
OLIGONUCLEOTIDE BINDING 
SITE; CHAIN: B, C; 


OLIGONUCLEOTIDE BINDING 
SITE; CHAIN: B, C;


Compound 


COMPLEX (ZINC FINGER/DNA) 
ZINC FINGER, PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 


COMPLEX (ZINC FINGER/DNA) 
ZINC FINGER, PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX (ZINC 
FINGER/DNA) 


COMPLEX (ZINC FINGER/DNA) 
COMPLEX (ZINC FINGER/DNA), 
ZINC FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC FINGER/DNA) 
COMPLEX (ZINC FINGER/DNA), 
ZINC FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC FINGER/DNA) 
COMPLEX (ZINC FINGER/DNA), 
ZINC FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC FINGER/DNA) 
COMPLEX (ZINC FINGER/DNA), 
ZINC FINGER, DNA-BINDING 
PROTEIN 


ZINC FINGER, DNA-BINDING 
PROTEIN 
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NUCLEASE INHIBITOR; 
N: A, D; ANGIOGENIN; 
N: B,E; 




FINGER PROTEIN GLI1; 
N: A; DNA; CHAIN: C, D; 


FINGER PROTEIN GLI 1; 
M: A; DNA; CHAIN: C, D; 




Compound 


COMPLEX (NUCLEAR 


COMPLEX (NUCLEAR 
PROTEIN/RNA) COMPLEX 
(NUCLEAR PROTEIN/RNA), RNA, 
SNRNP, RIBONUCLEOPROTEIN


RECOGNITION, EPITOPE 
MAPPING, LEUCINE-RICH 3 
REPEATS 


COMPLEX 

(INHIBITOR/NUCLEASE)
COMPLEX 

(INHIBITOR/NUCLEASE), 
COMPLEX (RI-ANG), 
HYDROLASE 2 MOLECULAR 




COMPLEX (DNA-BINDING 
PROTEIN/DNA) FIVE-FINGER 
GLI; GLI, ZINC FINGER, 
COMPLEX (DNA-BINDING 
PROTEIN/DNA) 


COMPLEX (DNA-BINDING 
PROTEIN/DNA) FIVE-FINGER 
GLI; GLI, ZINC FINGER, 
COMPLEX (DNA-BINDING 
PROTEIN/DNA) 


GLI; GLI, ZINC FINGER, 
COMPLEX (DNA-BINDING
PROTEIN/DNA) 
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25 


0 948 
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19 
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14 
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1149 


40 
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OO 
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23 
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u.9oJ 


A QC 1 
U.OJ 1 


i i on 
1187 


10 


a nm 


A OC/1 


1 i no 
llOO 


1 o 

1 / 


n o/ia 


A 750 

V./oy 


1 1 on 
1 lo9 


1 o 
lo 


A QOC 


A 78A 
U. / OH 


1 1 aa 

nyu 


1 0 

lo 


0 QAC 

u.yoj 


A 7^1 


1 1 ni 

nyi 


0*2 

z3 


n oca 


A AQA 


1 1 no 

119z 


31 


A OQO 

u.yyz 


A CAT 


1 1 

1 15/3 


oc 
ZD 


A OOI 

u.yyi 


A 048 


1 1 fM 

1194 


on 
zU 


A Q07 

u.yz / 


A A17 
U.Ol / 


1 1 AC 

1195 


uc 
2o 


A OQA 

u.yoo 


A QOC 


1 1 OA 
11V0 


OA 


A QCQ 

u.ooy 


A A1 8 
U.OlO 


1 1 07 
11V / 


OO. 


A OQO 


A 87T 
U.O/O 


1 1 oc 

i iyo 




A OOI 


A 81 5 
U.ol 3 


1 1 oo 

nyy 


1 c 
lo 


A Q5C 


A 0^6 

u.yjo 


1 OA1 

IzUl 


0 


A OQC 1 
U.OOO 


A CAA 


t OAO 


oo 

Z£> 


A OCO 


A 77A 


1 OAO. 


OO 

zy 


A OI A 

u.y 10 


A 7A7 
U. / U/ 


1 OA/1 
lzU4 


00 

zz 


A QAf\ 


A 8AA 
U.oUU 


1 OA* 
IZuD 


1 A 
10 


A 888 
U.OOO 


A A46 


1206 


21 


0.908 


0.558 


1207 


27 


0.953 


0.564 


1208 


43 


0.969 


0.757 


1209 


27 


0.965 


0.891 


1212 


19 


0.976 


0.809 


1213 


20 


0.988 


0.872 


1214 


31 


0.987 


0.871 
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Table 6 



SEQ ID NO:


Position of Signal
Peptide


Maximum score 


Average score 


1215 


18 


0.989 


0.880 


1216


34 


0.920 


0.550 


1218


20 


0.957 


0.870 


1910 


25 


0.928 


0.615 


1990 

1ZZU 


18 


0.989 


0.955 


1971 


14 


0.892 


0.686 


1 ZZZ 


21 


0.979 


0.940 


1991 

1 ZZJ 


24 


0.979 


0.930 


1 994 


49 
tz 


0.983 


0.771 


1 99S 


99 
zz 


0.982 


0.811 


1 996 
1 zzo 


91 

Z 1 


0.945 


0.794 


1 997 
1ZZ f 


1 < 
i j 


0.969 


0.910 


1 99Q 

1 ZZ7 


16 
1 u 


0.916 


0.622 


1 910 


90 

Z7 


0.972 


0.769 


1 919 


14 


0.945 


0.836 


1 911 


10 


0.963 


0.669 


1914 


9Q 

Z7 


0.989 


0.867 


19 IS 
IZj J 


14 


0.977 


0.891 


17^6 


16 


0.934 


0.673 


1717 
lZO / 


19 


0 922 


0.720 


J ZOO 


99 
ZZ 


o oso 


0.828 


1770 


99 
ZZ 


0 0S6 

U.7JU 


0.763 


n An 

1Z4U 


94 
Z*f 


0 0X1 


0.938 


17A1 


10 


ft R01 

U.Q7 1 


0.574 


1 7A7 
1Z4Z 


17 
3Z 


0 074 


0.869 


1 7A1 


11 


0 ROD 


0.675 


17AA 


7S 

ZJ 


0 014 


0.593 


1 9A< 


77 

ZZ 


0.944 


0.709 


17A6 


10 


0 040 


0.714 




70 

Z7 


0.889 


0.658 


1 7 AC 


10 


0 RR1 


0.749 


1 7AQ 


74 
z*t 


0 R09 


0.577 


IZDU 


71 
Z 1 


0 016 

U.71U 


0.662 


1ZD1 


7Q 

zy 


n 091 

U.7Z 1 


0.601 


iZDZ 


17 


0 OS4 


0.741 


1 7<\7 


77 
Z / 


0 RRR 


0.738 


1 7 5A 


7R 
zO 


0 0R1 


0.920 


17S6 


76 
zu 


0 07S 


0.705 


17S7 


10 
i y 


0.914 


0.698 


1258


18 


0.961 


0.869 


1259


41 


0.962 


0.600 


1260


18 


0.947 


0.664 


1261


18 


0.946 


0.739 


1262


90 

zv 


0.889 


0.561 


1263


11 


0.973 


0.865 


1264


1 o 


0.956 


0.850 


1265


14 


0.952 


0.875 


1266


29 


0.902 


0.563 


1267 


20 


0.966 


0.739 


1268 


23 


0.953 


0.688 


1269 


38 


0.919 


0.676 


1270 


27 


0.955 


0.826 


1271 


23 


0.913 


0.702 


1273 


21 


0.972 


0.915 


1274 


23 


0.950 


0.578 
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Table 6 



SEQ ID NO:


Position of Signal 
Peptide 


Maximum score 


Average score 


1275 


20 


0.996 


0.965 


1276 


20 


0.976 


0.937 


1278 


26 


0.962 


0.752 


1279 


38 


0.962 


0.756 


1280 


19 


0.991 


0.929 


1281 


27 


0.948 


0.670 


1282 


22 


0.932 


0.790 


1283 


23 


0.962 


0.679 


1285 


30 


0.888 


0.573 


1286 


15 


0.996 


0.988 


1287 


27 


0.992 


0.893 


1288 


24 


0.952 


0.685 


1289 


36 


0.953 


0.605 


1290 


32 


0.932 


0.649 


1291 


24 


0.990 


0.935 


1292 


24 


0.973 


0.940 


1293 


20 


0.965 


0.811 


1294 


18 


0.977 


0.957 


1296 


24 


0.987 


0.903 


1297 


12 


0.894 


0.780 


1298 


29 


0.899 


0.623 




19 


0.882 


0.753 




33 


0.996 


0.905 


1301 


21 


0.952 


0.663 


1302 


19 


0.984 


0.937 


1303 


32 


0.978 


0.885 


1305 


18 


0.985 


0.736 


1306 


46 


0.991 


0.888 


1308 


27 


0.996 


0.933 


1309 


24 


0.970 


0.913 


1310 


27 


0.930 


0.778 


1312 


16 


0.990 


0.959 


1313 


18 


0.949 


0.767 


1314 


18 


0.896 


0.752 


1315 


18 


0.984 


0.888 


1316 


21 


0.953 


0.721 


1317 


35 


0.923 


0.688 


1318 


27 


0.940 


0.796 


1319 


26 


0.990 


0.837 


1320 


24 


0.972 


0.663 


1321 


18 


0.969 


0.722 


1323 


21 


0.955 


0.709 


1324 


21 


0.979 


0.935 


1325 


26 


0.944 


0.675 


1326 


29 


0.931 


0.569 


1327 


18 


0.997 


0.955 


1329 


24 


0.985 


0.845 


1330 


43 


0.901 


0.602 


1331 


32 


0.965 


0.699 


1332 


15 


0.881 


0.608 


1334 


32 


0.896 


0.556 


1335 


18 


0.963 


0.807 


1336 


19 


0.909 


0.593 


1337 


16 


0.885 


0.562 


1338 


18 


0.911 


0.688 
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Table 6 



SEQ ID NO: 


Position of Signal
Peptide 


Maximum score 


Average score 


1339 


24 


0.980 


0.847 


1340 


25 


0.943 


0.774 


1341 


20 


0.973 


0.778 


1342 


27 


0.924 


0.686 


1343 


24 


0.914 


0.585 


1344 


16 


0.957 


0.773 


1345 


15 


0.906 


0.798 


1346 


16 


0.971 


0.855 


1347 


24 


0.980 


0.901 


1348 


23 


0.965 


0.642 


1349 


22 


0.899 


0.609


1350 


18 


0.940 


0.585 


1351 


19 


0.985 


0.935 


1352 


22 


0.945 


0.718 


1353 


20 


0.943 


0.728 


1354 


15 


0.887 


0.721 


1355 


16 


0.915 


0.737 


1358 


21 


0.948 


0.585 


1360 


30 


0.911 


0.555 


1361 


20 


0.976 


0.851 


1362 


19 


0.927 


0.791 




19 


0.947 


0.574 


1JUJ 


28 


0.997 


0.786 


1 166 




0.979 


0.855 


1 

IJO / 


9? 


U.07J 


0.577 


lJUO 


I? 


0.956 


0.829 


1 


16 


0.929 


0.739 


1370 


17 


0.931 


0.745 


1371


30 


0.950 


0.708 


1372 


28 


0.968 


0.856 


1373 


26 


0.953 


0.711 


1375 


32 


0.983 


0.842 


1376 


19 


0.929 


0.689 


1377 


30 


0.899 


0.631 


1378


25 


0.927 


0.775 


1379


19 


0.982 


0.922 


1380 


28 


0.940 


0.628 


1381 


20 


0.890 


0.610 


1382 


28 


0.921 


0.606 


1383 


23 


0.881 


0.644 


1384 


24 


0.978 


0.911 


1385 


21 


0.974 


0.723 


1386 


26 


0.980 


0.795 


1387 


16 


0.903 


0.654 


1388 


20 


0.912 


0.596 


1389 


19 


0.981 


0.960 


1390 


25 


0.932 


0.790 


1391 


15 


0.990 


0.963 


1395 


18 


0.942 


0.709 


1396 


28 


0.963 


0.844 


1397 


19 


0.972 


0.882 


1398 


21 


0.966 


0.827 


1399 


21 


0.962 


0.752 


1400 


25 


0.979 


0.855 


1402 


23 


0.913 


0.685 
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Table 6 



SEQ ID NO: 


Position of Signal 
Peptide 


Maximum score 


Average score 


1403 


19 


0.935 


0.829 


1404 


21 


0.984 


0.958


1405 


27 


0.888 


0.566 


1406 


36 


0.945 


0.564 


1407 


19 


0.938 


0.755 


1408 


22 


0.947 


0.745 


1409 


16 


0.909 


0.728 


1410 


20 


0.961 


0.866 


1412 


22 


0.991 


0.926 


1413 


20 


0.911 


0.683 


1414 


15 


0.905 


0.737 


1416 


13 


0.933 


0.799 


1417 


46 


0.956 


0.728 


1418 


20 


0.945 


0.782 


1419 


19 


0.987 


0.953 


1420 


30 


0.976 


0.862 


1421 


24 


0.964 


0.796 


1423 


23 


0.924 


0.645 


1425 


19 


0.913 


0.670 


1426


33 


0.968 


0.774 


1427


22 


0.941 


0.632 


1428


15? 


0.972 


0.935 


1429


15 


0.978 


0.909 


1430 


26 


0.926 


0.713 


1431 


26 


0.915 


0.659 


1432 


21 


0.949 


0.790 


1433 


27 


0.996 


0.854 


1434 


26 


0.910 


0.590 


1436 


21 


0.983 


0.793


1437 


18 


0.932 


0.643 


1438 


21 


0.908 


0.583 


1439 


24 


0.925 


0.742 


1440 


15


0.909 


0.736 


1441 


30 


0.883 


0.615 


1442 


37 


0.960 


0.714 


1444 


30 


0.942 


0.586 


1445 


24 


0.904 


0.640 


1446 


26 


0.950 


0.724 


1447 


15 


0.956 


0.757 


1448 


30 


0.906 


0.692 


1449 


21 


0.933 


0.751 


1450 


25 


0.990 


0.855 


1451 


20 


0.893 


0.775 


1452 


26 


0.952 


0.729 


1453 


44 


0.990 


0.654 


1454 


20 


0.974 


0.810 


1455 


21 


0.960 


0.679 


1456 


17 


0.926 


0.629 


1457 


23 


0.982 


0.940 


1458 


18 


0.986 


0.938 


1459 


22 


0.940 


0.617 


1460 


18 


0.939 


0.698 


1461 


39 


0.997 


0.955 


1462 


11 


0.989 


0.626 


1463 


16 


0.972 


0.911 
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Table 6 



SEQ ID NO:


Position of Signal 
Peptide


Maximum score


Average score


14£< 

140J 


17 
1 / 


ft 048 


0.855 


140O 




ft 0ft1 


0.739 


140/ 


zu 


n Q60 


0.883 


14/>Q 
1408 


0£ 
zo 


ft ofti 


0.585 


i4oy 


1 8 
10 


ft 014 


0.710 


1 Aid 
14 /U 


01 


ft 072 


0.908 


14/1 


10 
iy 


ft 042 


0.626 


14 I j 


OS 

Z.) 


0 072 


0.670 


1474 

14 /4 


K 
13 


0 017 


0.810 


147S 
14/J 


4U 


ft 021 


0.825 


1/177 


01 

Zi 


0.914 


0.589 


14 /o 


06 
ZU 


0 964 


0.721 


1470 
14 /y 


10 


0 936 


0.624 


1 4C1 
1451 




0.995 


0.943 


1 /to 

148Z 


Oft 


ft 0QS 


0.959 


1 AQA 
1484 


10 
iy 


ft 064 


0.755 


14QC 
1483 


1 c 


ft 0S6 


0 R47 


1400 


97 
Z/ 


ft 061 
u.yoj 


ft SR4 


1 A Q1 

148/ 


ZJ 


ft Q41 
u.y4i 


ft 7R1 


MOD 

1488 


JZ 


ft QAQ 
U.yoy 


ft R16 

U.OlO 


i a on 
1489 


OQ 

zy 


ft OS£ 

u.yjo 


ft 740 

U. /4Z 


1491 


on 
zu 


ft QQ4 

u.oyn 


ft 61 S 
U.OlJ 


1492 


1 A 

34 


n ooo 


ft A£R 
U.OOo 


1493 


1 <r 
16 


n o/io 


ft onn 


1494 


i n 
19 


u.yoy 


ft R7R 


1495 


1*7 
11 


U.y44 


' ft 70^ 
u. /ZO 


1490 


A C 

43 


u.yi j 


ft ^RR 
U.OOo 


1 Adn 
149/ 


43 


ft GftR 

u.yuo 


ft SRI 

U.JOJ 


i /inn 
1499 


43 


ft QR7 

u.yo/ 


ft ROft 
U.OZU 


1 cnn 


zu 


ft 070 

u.y /z 


ft 70ft 


1 ^ni 


1 4 
14 


ft RR1 


ft 617 


i 


04 
Z4 


ft 071 


ft 7R6 


1 ^ft4 
1 DU4 


1 A 


ft 001 

U.7ZJ 


ft 7S0 


i cnc 


oo 

zz 


ft Q£S 

u.yoj 


ft R00 

U.OZ7 


1 CAT 

1DU/ 


4j 


ft 006 
u.yyo 


ft 0ft7 


1 cnn 


01 
Zl 


ft 048 
u.y40 


ft 710 

\J. I JZ 


1 CIA 

1D1U 


oo 

zs 


ft 060 
v.yoz 


ft R00 


1011 


04 


ft 001 
u.yz i 


ft 646 


1^10 
I D1Z 


1 0 

iy 


ft QSO 
u.yjy 


ft 7S^ 




40 


ft 060 


0.628 


1 S14 
1 »14 


01 
Zl 


ft 00R 


0.717 


1 <;i <: 
IjID 


IO 


ft 006 
u.yzo 


ft 711 


IjIO 


13 


ft RRS 


ft 661 


1<17 
131 / 


01 
Zl 


ft 01S 


ft 70S 


1318 


Ol 


ft Q4S 


ft RS0 


1 C 1 O 
1D19 


1 0 


ft RR1 

U.OO 1 


ft 616 


1 ^Oft 
13ZU 


on 
zu 


ft 040 
u.y4y 


ft 7ft4 


1 <\01 
13Z1 


01 
Z 1 


ft 01R 


0 74S 


1522 


20 


0.977 


0.923 


1523 


23 


0.925 


0.619 


1524 


20 


0.933 


0.728 


1525 


11 


0.912 


0.784 


1526 


29 


0.907 


0.656 


1527 


18 


0.962 


0.704 


1528 


42 


0.977 


0.817 



Printed from Mimosa 05/11/28 15:58:17 Page: 417 



WO 03/080795 



PCT/US02/25485 






Table 6 



SEQ ID NO:


Position of Signal
Peptide


Maximum score 


Average score 


1 590 


37 


0.960 


0.623 




22 


0.899 


0.649 




22 


0.943 


0.663 


1 ^Tl 


20 


0.970 


0.936 


1 V*4 


28 


0.934 


0.607 


1 


30 


0.989 


0.890 


1 ^fi 


16 


0.984 


0.932 


1 ^7 


22 


0.992 


0.974 


1 <1R 


35 


0.976 


0.622 


1 

i jjy 


20 


0.901 


0.576 


1 ^4fl 


28 


0.944 


0.697 


1 54? 


28 


0.936


0.667 




25


0.891 


0.550 


1544


21 


0.967 


0.700 




31 


0.938 


0.649 


1546


21 


0.883 


0.569


1547


29 


0.953 


0.614


1548


12 


0.916 


0.815




23 


0.955 


0.658 




21 


0.948 


0.635 




10 

1 7 


0.956 


0.835 




1 R 


0.960 


0.803 


1 C</l 


jj 


0.920 


0.577 






0.947 


0.717 




^1 


0.898 


0.658 


1 


94 


0.960 


0.876 


1 <;<c 

1 JJO 


9^ 


0.985 


0.878 


1 s&f\ 


JO 


0.919 


0.553 


1 


19 


0.942 


0.841 


1 <<o 


91 


0.887 


0.568 


1 Z<1 


1Q 


0.990 


0.928 


1304 


1R 

A O 


0.950 


0.814 


1 <£7 
00/ 


96 


0.970 


0.822 




14 


0.928 


0.806


ID f\) 


9fi 


0.998 


0.969 




1R 

AO 


0.911 


0.762 


1 <T> 


28 


0.986 


0.924 


1574


15 


0.935 


0.815 


1575 


18 


0.955 


0.896 


1576


26 


0.949 


0.697 


1577


20 


0.945 


0.856 


1578


24 


0.962 


0.723 


1579


23 


0.976 


0.716 


1580


20 


0.903 


0.597 


1582


19 


0.880 


0.679




25 


0.984 


0.918 


1584


22 


0.991 


0.876 


1585 


23 


0.968 


0.710 


1586 


33 


0.894 


0.596 


1587 


23 


0.918 


0.721 


1588 


19 


0.913 


0.703 


1589 


14 


0.951 


0.886 


1590


28 


0.887


0.557 


1591 


26


0.999 


0.969 


1592 


19 


0.968 


0.865 
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Table 6 



SEQ ID NO:


Position of Signal 
Peptide 


Maximum score 


Average score 


1593 


32 


0.962 


0.612 


1594


22 


0.966 


0.864 


1J7U 


19 


0.970 


0.823 




15 


0.917 


0.825 




32 


0.991 


0.900 


1599


26 


0.927 


0.693 


IWU 


18 


0.896 


0.656 




16 


0.926 


0.833 


1602


18 


0.948 


0.883 




18 


0.977 


0.868 


1604


34 


0.943 


0.730 


1606


15 


0.930 


0.640 


1607 


32 


0.967 


0.697 


1608


21 


0.922 


0.658 


1610


30 


0.881 


0.586 


1611


30 


0.887 


0.667 


1612


19 


0.938 


0.565 


1613


22 


0.977 


0.894 


1614


20 


0.925 


0.725 


1615


25 


0.972 


0.746 


1616


30 


0.986 


0.671 


loiy 


18 


0.917 


0.620 


10ZU 


28 


0.968 


0.611 


1 £71 
10Z1 


29 


0.925 


0.613 


1 £77 
10ZZ 


48 


0.968 


0.711 


' 1 £71 
10ZJ 


24 


0.937 


0.586 


169/1 


19 


0.914 


0.694 


1 675 
lOZJ 


26 


0.906 


0.685 


1 676 
lOZO 


14 


0.962 


0.863 


, lOZ / 


28 


0.976


0.911 


1670 


17 


0.973 


0.938 


1610 


22 


0.962 


0.919 


161? 


31 


0.997 


0.846 


1611 


25 


0.920 


0.607 


10J*t- 


17 


0.982 


0.945 


161*J 


17 


0.994 


0.968 


1 £18 
ID JO 


30 


0.922 


0.705 


|£1Q 

io^y 


21 


0.952 


0.714 


1640 


21 


0.966 


0.807 


1641 


23 


0.983 


0.821 


1642


18 


0.953 


0.885 


1643 


16 


0.907 


0.647 


1644 


20 


0.884 


0.650 


1645


17 


0.959 


0.680 


1646 


18 


0.991 


0.954 


1647 


30 


0.983 


0.786 


1648


21 


0.886 


0.567 


1649


24 


0.894 


0.658 


1650 


23


0.881 


0.657


1651 


27 


0.932 


0.702 


1652 


22 


0.993 


0.885 


1653 


17 


0.990 


0.926 


1654 


19 


0.932


0.622 


1655 


34 


0.931 


0.673 


1656 


19 


0.966 


0.909 
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Position of Signal 
Peptide 


Maximum score 


Average score 


1657 


17 


0.955 


0.867 


1658 


38 


0.954 


0.594 


1659 


19 


0.920 


0.710 


1660 


37 


0.988 


0.598 


1662 


32 


0.909 


0.675 


1664 


16 


0.937 


0.804 


1665 


20 


0.911 


0.621 


1667 


29 


0.981 


0.871


1668 


33 


0.972 


0.869 


1669 


22 


0.968 


0.913 


1670 


23 


0.990 


0.932 


1672 


22 


0.939 


0.716 


1673 


17 


0.963 


0.865 


1674 


38 


0.949 


0.669 


1675 


20 


0.926 


0.787 


1677 


19 


0.938 


0.785 


1678 


20 


0.929 


0.727 


1679 


20 


0.916 


0.604 


1680 


21 


0.967 


0.886 


1681 


20 


0.909 


0.749 


1682 


30 


0.928 


0.776 


1683 


20 


0.916 


0.649 


1684 


21 


0.976 


0.879 


1685 


13 


0.897 


0.645 


1686 


13 


0.994 


0.963 


1687 


17 


0.898 


0.743 


1688 


30 


0.946 


0.638 


1689 


21 


0.996 


0.976 


1690 


18 


0.916 


0.595 


1691 


17 


0.934 


0.754 


1692 


28 


0.899 


0.753 


1693 


20 


0.933 


0.655 


1694 


19 


0.990 


0.920 


1695 


17 


0.945 


0.731 


1697 


18 


0.885 


0.588 


1698 


29 


0.986 


0.937 


1699 


26 


0.972 


0.557 


1700 


17 


0.977 


0.946 


1701 


17 


0.882 


0.608 


1702 


20 


0.989 


0.952 


1703 


22 


0.919 


0.573 


1706 


31 


0.895 


0.648 


1707 


22 


0.965 


0.922


1708 


22 


0.937 


0.569 


1709 


20 


0.980 


0.903 


1710 


17 


0.972 


0.857 


1711 


27 


0.984 


0.823 


1712 


17 


0.963 


0.872 


1713 


24 


0.977 


0.880 


1714 


17 


0.970 


0.908 


1715 


31 


0.973 


0.843 


1716


18 


0.931 


0.703 


1717 


18 


0.931 


0.702 


1718 


34 


0.946 


0.628 


1719 


19 


0.973 


0.883 






WO 03/080795 



PCT/US02/25485 



Table 6 



SEQ ID NO: 


Position of Signal 
Peptide 


Maximum score 


Average score 


1720 


48 


0.980 


0.845 


1721 


28 


0.922 


0.676 


1722 


44 


0.965 


0.645 


1723 


26 


0.887 


0.730 


1724 


25 


0.939 


0.795 


1725 


15 


0.971 


0.942 




23 


0.923 


0.591 


177ft 


23 


0.987 


0.936 


1729 


18 


0.927 


0.814 


1730 


18 


0.935 


0.605 


1731 


25 


0.972 


0.912 


1732 


42 


0.972 


0.726 


1733 


20 


0.952 


0.798 


1734 


17 


0.975 


0.918 


1735 


15 


0.979 


0.877 


1736 


41


0.933 


0.659 


1738


17 


0.925 


0.746 


1739


18


0.912 


0.764 


1741 


11 


0.953 


0.814 


1742


23 


0.976 


0.774


1744 


23 


0.918 


0.606 


1746 


29 


0.915 


0.652 


1747 


15 


0.933 


0.840 


1748 



27 


0.903 


0.612




29 


0.904 


0.618 


1751 


22 


0.888 


0.670 


1752


16 


0.979 


0.868 


1753 


26 


0.959 


0.884 


1754 


22 


0.954 


0.696 


1755 


20 


0.895 


0.707 


1756 


26 


0.906 


0.703 


1757 


14 


0.888 


0.587 


1758


15


0.994 


0.953 


1759


21 


0.922 


0.610 


1760 


21 


0.942 


0.693 


1761 



19 


0.947 


0.814 


1762 


21 


0.934 


0.655 


1763 


22 


0.940 


0.609 


1764 


23 


0.937 


0.832 


1765 


23 


0.896 


0.677 


1766 


26 


0.909 


0.690 


1768 


18 


0.915 


0.689 


1769 


36


0.969 


0.602 


1770 


20 


0.880 


0.640 


1772 


20 


0.942 


0.715 


1773 


20


0.947 


0.817 


1774 


16 


0.969 


0.880 


1775 


18 


0.971 


0.859 


1776 


24 


0.891


0.670 


1777 


27 


0.961 


0.747


1778 


40 


0.963 


0.574 


1779 


23 


0.974 


0.656 


1780 


21 


0.899 


0.653 


1781 


25 


0.908 


0.601


1782 


19 


0.943 


0.678 
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SEQ ID NO: 


Position of Signal 
Peptide 


Maximum score 


Average score 


1783 


23 


0.936 


0.634 


1784 


29 


0.949 


0.786 


1785 


44 


0.915 


0.571 


1786 


22 


0.965 


0.885 


1787 


15 


0.974 


0.940 


1789 


23


0.952 


0.659 


1790 


16 


0.972 


0.898 


1791 


21 


0.980 


0.953 


1792 


32 


0.961 


0.668 


1793 


29 


0.907 


0.551 


1794 


22 


0.957 


0.934 


1795 


21 


0.990 


0.849 


1796 


22


0.954 


0.893 


1797 


16 


0.942 


0.657 


1799 


25 


0.949 


0.840 


1800 


28 


0.949 


0.739 


1801 


25 


0.938 


0.767 


1802


15 


0.899 


0.672 


1803


17 


0.987 


0.956 


1804 


24 


0.941 


0.775


1805


26


0.972 


0.771 


1806


20 


0.985 


0.957 


1807 


22 


0.932 


0.571 


1808 



16 


0.927 


0.608 


1809 


26 


0.987 


0.770 


1810 


37 


0.955 


0.592 


1811 


28 


0.911 


0.632 


1812 


24 


0.894 


0.698 


1813 


22 


0.906 


0.624 


1814 


34 


0.951 


0.806 


1816 


25 


0.919 


0.578 


1817 


26 


0.980 


0.932 


1818 


19 


0.993 


0.940 


1820 


26 


0.939 


0.810 


1821 


48 


0.967 


0.556 


1822 


19 


0.931 


0.753 


1823 


36 


0.892 


0.670 


1824


18 


0.903 


0.674 


1825 


17 


0.966 


0.854 


1826 


15 


0.938 


0.849 


1827 


27 


0.985 


0.891 


1828 


17 


0.895 


0.665 


1829 


36 


0.916 


0.620 


1830 


22 


0.952 


0.835 


1831 


17 


0.961 


0.731 


1832 


19 


0.996 


0.982 


1833 


19 


0.918 


0.556 


1834 


37 


0.926 


0.587 


1836 


14 


0.897 


0.787 


1837 


19 


0.960 


0.816 


1838 


31 


0.902 


0.632 


1839 


17 


0.987 


0.955 


1840 


23 


0.988 


0.941 


1842 


26 


0.915 


0.695 


1843 


26 


0.987 


0.926 
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1844 
1845 



1846 
1847 



Table 6 



Position of Signal 
Peptide 



15 



16 



20 
18 



Maximum score 



0.933 



0.942 



0.914 



Average score 



0.731 



0.750 



1848 



24 
26 



0.899 



0.988 



0.695 



0.883 
0.612 



1850 



1851 



1853 



1854 



1855 



1856 



1857 



1858 



1859 



1860 



1862 



1863 



1864 



1865 



1867 



1868 



1869 



1870 



1871 



1872 



1873 



1874 



1875 



1876 



1877 



1878 



1879 



1880 



1881 
1882 
1883 
1884 
1886 
1887 
1888 
1889 
1890 
1891 

1892
1894 
1895 
1896 
1897 
1898 

1899

1900
1901

1902
1903 



31 



22 



30 
24 



14 



19 



21 



20 



23 



18 



16 



21 



24 



37 
19 



37 



20 



18 



16 



16 



24 



33 



26 



20 



18 



27 



45 



16 



35 
23
19 
26 
25 
21 
39 
20 
23 
26 
20 
16 
28
19 
17 
19 
22
26 
16 
38 
26 



0.956 



0.961 



0.966 



0.921 



0.973 



0.938 



0.931 



0.908 



0.933 



0.920 



0.896 



0.887 



0.974 



0.982 



0.997 



0.960 



0.970 



0.950 



0.952 



0.921 



0.908 



0.991 



0.898 



0.904 



0.983 



0.951 



0.971 



0.966 
0.940 



0.926 

0.882 

0.933 

0.919 

0.911 

0.987 

0.965 

0.967 

0.980 

0.896 

0.882 

0.914 

0.997 

0.899 

0.893 

0.976 

0.952 

0.990 

0.985 

0.912 

0.952 



0.568 



0.882 



0.610 



0.922 



0.902 



0.745 



0.556 



0.837 



0.633 



0.737 



0.641 



0.937 



0.899 



0.901 



0.758 



0.851 



0.684 



0.694 



0.724 



0.579 



0.913 



0.689 



0.707 



0.967 



0.739 



0.862 



0.761 



0.778 



0.704 

0.567 

0.703 

0.754 

0.570 

0.931 

0.616 

0.885 

0.871 

0.665 

0.729 

0.741 

0.888 

0.777 

0.615 

0.821 

0.791 

0.775 

0.958 

0.654 

0.870 
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SEQ ID NO:


Position of Signal 
Peptide 


Maximum score 


Average score 


1904 


25 


0.949 


0.844 


1905 


23 


0.945 


0.718 


1906 


18 


0.907 


0.556 


1907 


20 


0.961 


0.786 


1908 


19 


0.907 


0.752 


1909 


17 


0.957 


0.808 


1910 


22 


0.933 


0.778 


1911 


22 


0.988 


0.913 


1912 


32 


0.964 


0.814 


1913 


21 


0.952 


0.784 


1914 


24 


0.946 


0.644 


1915 


21 


0.919 


0.644 


1916 


21 


0.969 


0.912 


1917 


16 


0.962 


0.681 


1918 


14 


0.926


0.776 


1919 


23 


0.987 


0.897 


1920 


48 


0.987 


0.614 


1921 


23 


0.899 


0.677 


1922 


23 


0.907 


0.651 


1923 


16 


0.921 


0.706 


1924 


20 


0.928 


0.672 


1925


26 


0.985 


0.942 


1926


27 


0.911 


0.682 


1927


19 


0.939 


0.700 


1928


15 


0.887 


0.709 


1929 


15 


0.980 


0.959


1930


25


0.987 


0.924 


1931


28 


0.936 


0.745 


1932 


20 


0.958 


0.669 


1933 


21 


0.988 


0.945 


1934 


24 


0.912 


0.699 


1935 


23 


0.909 


0.726 


1936


20 


0.964 


0.924 


1937


28 


0.960 


0.813 


1938 


18 


0.971 


0.806 


1939 


20 


0.954 


0.746 


1941 


20 


0.986 


0.933 


1942 


45 


0.976 


0.736 


1944 


18 


0.967 


0.871 


1945 


20 


0.973 


0.759 


1947 


17 


0.954 


0.919 


1948 


21 


0.970 


0.871 


1949 


18 


0.991 


0.976 


1950 


27 


0.893 


0.647 


1951 


19 


0.881 


0.705 


1952 


24 


0.977 


0.830 


1953 


15 


0.957 


0.834 


1954 


29 


0.970 


0.863 


1956 


19 


0.940 


0.835 


1957 


32 


0.992


0.891 


1958 


22 


0.968 


0.837 


1959 


27 


0.908 


0.725 


1960 


20 


0.941 


0.751 


1961 


21 


0.885 


0.669 


1962 


29 


0.955 


0.797 
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SEQ ID NO:


Position of Signal 
Peptide 


Maximum score 


Average score


1963 


16 


0.974 


0.950 


1964 


21 


0.929 


0.745 


1965 


24 


0.913 


0.658 


1966 


45 


0.937 


0.671 


1968 


43 


0.956 


0.581 


1969 


19 


0.956 


0.614 


1970 


46


0.901 


0.566 


1971 


24 


0.947 


0.768 


1972 


24 


0.900 


0.642 


1974 


22 


0.988 


0.922 


1975 


24 


0.951 


0.710 


1976 


18 


0.932 


0.740 


1977 


18 


0.954 


0.736 


1978 


20 


0.994 


0.967 


1979 


26 


0.987 


0.926 


1980 


22 


0.964 


0.866 


1981 


13 


0.932 


0.870


1982 


21 


0.949 


0.881


1983 


23 


0.957 


ft fiSR 


1984 


12 


0.954 




1985 


22 


0.990 


fi R9Q 


1986 


31 


0.987 




1987 


20 


0.919 


0.791


1988 


17 


0.985 


ft Q££ 


1989 


24 


0.966 


ft R^ft 


1990 


31 


0.971 


ft Rl fi 
U.O JO 


1991 


15 


0.935 


ft $29"* 


1992 


21 


0.967 


ft Rft9 


1994 


18 


0.930 


ft fiSft 


1995 


20 


0.902 


n fin 


1996 


23 


0.946 


0.794


1997 


25 


0.943 


0.787


1998 


18 


0.921 


0.666 


1999 


13 


0.883 


0.748


2000


24 


0.899 


ft S7Q 


2001 


13 


0.918 


0.705 


2002 


18 


0.899 


ft RftQ 


2003 


18 


0.950 


ft fi47 


2004 


30 


0.981 


ft RfiQ 


2005 


17 


0.950 


0.771


2007 


24 


0.940 


ft Rftft 


2008 


21 


0.980 


ft Rl ^ 


2009 


43 


0.939 


ft fiS<\ 


2010 


16 


0.920 


ft fiQR 


2011


30 


0.978 


ft Oft 1 


2012 


19 


0.981 


ft 010 


2013 


40 


0.978 


ft 


2014 


20 


0.994 


0.960 


2015 


18 


0.955 


0.771 


2016 


25 


0.914 


0.769 


2017


31 


0.952 


0.776 


2018 


26 


0.985 


0.854 


2019 


16 


0.945 


0.822 


2020 


22 


0.973


0.804 


2021 


17 


0.954 


0.919 
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Position of Signal 
Peptide 




Average score


2022 


19 


0.993 


ft 97? 


2023 


18 


0.921 


0 6R? 


2026 


23 


0.890 


0 6ftd 


2027 


35 


0.943 


ft 6ft? 
U.OUj 


2028 


25 


0.992 


ft OS? 


2029 


47 


0.950 


ft R46 


2030 


17 


0.914 


ft 779 


2032 


18 


0.995 


ft 074 


2033 


17 


0.933


ft R9R 


2034 


17 


0.934 


A 64>d 


2035 


26 


0.910 


a S67 

U.JO/ 


2036 


30 


0.940 


ft 6Qft 

u.oyu 


2037 


23 


0.908


ft SS7 


2038 


18 


0.906 


ft 694 


2039 


18 


0.926 


ft 76R 
U. /Oo 


2040 


14 


0.934 


ft 7SR 
U. / JO 


2041 


18 


0.960 


ft R6Q 


2042 


21 


0.911 


ft 71 6 
U. / 10 


2043 


25 


0.896 


ft S76 
U.J /O 


2044 


27 


0.953 


U.ojU 


2045 


17 


0.962 


0.603


2046 


25 


0 094 


A <T> 

U.J /z 


2047 


39 


ft OSS 
l/.^J J 


A £AO 

U.oOo 


2048 


38 


ft OSR 




2049 


25 


0 04Q 


u.o03 


2050 


27 


ft o?9 


l>. AZo 


2051 


15 


o Oftft 


U.O/2 


2052 


22 


ft 067 


A HAI 
U. /U3 


2053 


19 


n 06ft 


U. / J/ 


2054 


20 


n Rko 


U. //j 


2055 


19 


non 


U. /4i 


2057 


23 




A OO 


2058 


23 


ft R9? 


U. /iJo 


2059 


26 


ft 0<rt 


a rift 

U.619 


2060 


19 


ft 0?S 1 


A *7TA 


2061 


44 


ft 0S9 




2062 


31 


0.964




2063 


19 


0.924


u. /uv 


2064


18 


0.891 


U.O/j 


2065 


25 


0.912 


0.764


2067 


25 


0.954 


0.819


2068 


20 


0.913 


ft 6RS 


2069 


40 


0.974 


ft 6R6 


2070 


28 


0.991


ft RQ6 


2072 


18 


0.956 


0.544


2073 


26 


0.928 


0.741


2074 


17 


0.902 


0.678 


2075 


18 


0.965 


0.850 


2076 


27 


0.975 


0.937 


2077 


32 


0.988 


0.863 


2078 


29 


0.922 


0.662 


2080 


20 


0.986 


0.918 


2081 


13 


0.969 


0.953 
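
Each Table 6 entry is a four-field record: SEQ ID NO, position of the signal peptide, maximum score, and average score. The following is only a minimal sketch of how such entries might be regrouped from a flat run of cells and sanity-checked. The names `SignalPeptideRow` and `rows_from_cells` are hypothetical and are not part of the disclosure, and the check assumes that the average score is taken over the same set of scores as the maximum score, so that it cannot exceed the maximum.

```python
from typing import Iterable, List, NamedTuple


class SignalPeptideRow(NamedTuple):
    """One Table 6 record (hypothetical container, not part of the disclosure)."""
    seq_id: int        # SEQ ID NO
    position: int      # position of the signal peptide
    max_score: float   # maximum score
    avg_score: float   # average score


def rows_from_cells(cells: Iterable[str]) -> List[SignalPeptideRow]:
    """Regroup a flat run of table cells into four-field rows."""
    # Drop empty cells left behind by blank lines between table cells.
    values = [c.strip() for c in cells if c.strip()]
    if len(values) % 4 != 0:
        raise ValueError("expected a multiple of four non-empty cells")
    rows = []
    for i in range(0, len(values), 4):
        row = SignalPeptideRow(
            int(values[i]),
            int(values[i + 1]),
            float(values[i + 2]),
            float(values[i + 3]),
        )
        # Assumption: an average over a set of scores cannot exceed their maximum.
        if row.avg_score > row.max_score:
            raise ValueError(f"SEQ ID NO {row.seq_id}: average exceeds maximum")
        rows.append(row)
    return rows


# Two entries copied from Table 6 (SEQ ID NOs 2080 and 2081).
cells = ["2080", "", "20", "0.986", "0.918", "2081", "13", "0.969", "0.953"]
rows = rows_from_cells(cells)
```

A row whose average score exceeds its maximum score is rejected, which may help flag cells that were shifted or misread during extraction.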
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SEQ ID NO:


Chromosomal location


1 


z 


2 


c 
J 


3 


1" 1 


4 


c 

J 


5 


1 c 

Id 


5 


A 

4 


7 


iz 


0 
0 


A 

4 


r g 


ID 






1 1 

1 J. 


0 


19 


1 f\ 


X J 




14 
it 


0 




1 A 
10 


1 u 


12q 


17 


1 


10 


2 


Zw 


X 


71 
Zl 


4 


91 
Zj 


12 


Z4 


11 


7<; 


1 


7fi 
ZD 


16 


77 
Z / 


8 


Zo 


1 


70 

zy 


11 


in 


3 


11 
j i 


2 


^7 


1 


11 

jj 


17 


^4 


4 


ju 


X 


jO 


2 


10 


16 


41 


19 


47 

, fZ 


4 


41 


8 


44 


4 


4S 


19 


4fi 

*tU 


18 


47 


6 


48 


9 


40 


10 


S7 


11 ] 




1 o 

18 


S4 


17 


55 


17 


56 


5 


57 


21 


59 


4 


60 


10 


61 


18 


u_ 63 


4 


64 


11 


65 


20q11.21-11.23
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Chromosomal location


66 


1 5 

ID 


68 


1 1 
X I 


70 




71 


i y 


72 


1 X 


75 


1 
1 


77 


Z 


78 


1 
J 


79 


n 
1 


80 


"5 
J 


81 


1 

X 


82 


1 7 


83 


OpX l.Z-lZ.,3 


84 


1 
X 


85 


/I 
*f 


86 




87 


xz 


88 


o 


90 


z 


92 


0 


95 


1 r 

1j 




xu 


97 


4 


98 


14q31 


90 

yy 


1 


inn 


5 




2 


in? 

1UZ 


4 




4 


IfU 
1 u*t 


19 


105 


I X 




3 


109 


XU 


111 


A 


114 
i i*t 


V 

A 


1 1 j 


Z 


116 

11U 


1 


117 


c 


118 


y 


120 


z 


121 


xy 


123 


z 


124 


1 A 1 
XU 


125 


r I 
J 


126


A 


128 


X 


130 


•j 
3 


131 




135


y 


136 


16 


137


17 


138 


2 


139


2 


140 


6q16.1-16.3


142 


9


143 


20 
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SEQ ID NO:


Chromosomal location


145 


o 
O 


146 


22q13


147 


i 


ito 


6 


149 


1 


151 


K 

O j 




6 ] 




2 | 


1^ 


4 i 


1 ^6 


17 


1 *J7 


17 ] 


1 SR 
i Jo 


11 | 




11 1 


16ft 


16 | 


161 


1 




17 j 




1 j 


164 

!Of 


5 j 


16^ 


15 | 


1 66 
100 


3 


lOo 


9 


ioy 


l 6 1 


1 7fk 
1 /U 


16 1 


1/1 


1 1 


17? 
1 /Z 


4 "J 


17/1 


10 J 


17^ 

1 / J 


S | 


176 "> 
1 /O 


6 


177 


15 i 


17R 
1/0 


6 


170 
i /y 


9 j 


1Rfl 

10U 


9- \ 


1R1 
lol 


2 J 


1 R7 


6 | 


1R1 


2 ! 


1R5 


11 ] 


1 R6 
150 


11 _J 


1 Rfl 
loo 


18 


1 RO i 


11 ! 


ion 


9 1 


101 
1^1 


10 j 


107 


4 


101 


Xql3.2-21.1 i 


104 r 


10 j 


106 


20 "J 


107 


10 j 


198 


6 


199 


11 | 


201 


1 1 i 
11 


!"~ 203 


X _J 


206 


8 ! 


207 


11 


208 


19 


209 


15 _J 


210 


3q 


211 ! 


6q25.1-26 1 
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SEQ ID NO:


Chromosomal location


Of) 
21/ 


y 


214 


1 Q 

iy 


21 D 


2U 


on 


1 


21o 


22q13.31-13.33




t 
1 


220 


2 


221 


3 


III 


9 


llj 


IS 


22j 


3p 


220 


lo 


228 


4 


zzy 


17 


ITA 

23 U 


1 / 


231 


• 
1 


232 


19 


234 


11 


23j 


19 


238 


3 


239 


6 


241 


11 


242 


10 


243 


15 


244 


4 


245 


21 


246 


19 


248 


6p12.3-21.2


249


3 


250 


1 


2 J 1 


20 


OCT 

2->2 


16q24.3 


2D3 


19 


234 


14 


ZJJ 


9 


OCT 

Z3 / 


2 


oco 
ZJ5 


11 


259 


17 


zoU 


19 


261 


8 


202 


3 


203 


o 

8 


204 


16 


203 


9q34.2-34.3 


200 


10 


20/ 


17 


20o 


4 


2oy 


3p 


OOA 

2/U 


9q 13-2 133 


271 


i 

i 


272 


8 


273 


19 


275 


17 


279 


3q 


280 


15 


281 


6 
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SEQ ID NO: 


Chromosomal location


282 


n 
i / 


283 


i n 


285 


1 c 

ID 


286 


<r 
D 


289 j 


1U 


290 


y 


292 


/ 


293 


o 
8 


294 


1 o 

18 


296 


4 


297 


15 


298 


15 


299 


1 A 

10 


300 


7 


301 


C 
-> 


302 




304 


1 


305 


Xq25-26.2


306 


1 o 

18 


307 


2 


308 


17 


309 


1 


310 


12 


311 


20 


313 


18 


314 


11 


315 


14 


316 


6 


317 


10 


318 


10 


319 


19 


320 


9 


321 


6 


322 


10 


323 


3 


324 


10 


325 


1 


326 


16 


327 


6 


328 


X 


330 


+ A 

4 


331 


Z 


332 


14 


333 


2 


334 


2 


336 


ZlqzZ.i 


337 


9 


338 


iy 


339 


15 


1Af\ 


4 


341 


9 


342 


10 


343 


19 


344 


5 


346 


16 


349 


3 
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SEQ ID NO:


Chromosomal location


350 


11 


352 


17 


353 


18 


354 


20 


356 


3 


357 


5 


358 


11 


359 


9 


364 


2 


365 


4 


366 


7 


367 


5 


369 


8 


370 


4 


371 


6q15-16.1


372 


19 


374 


2 


375 


12 


376 


17 


377 


1 


379 


19 


380 


9 


381 


6. 


382 


9 


383 


18 


384 


18 


385 


3 


387 


1 


388 


21 


389 


17 


390 


17 


391 


4 


393


10 


394 


11 


395 


11 


396 


10 


397 


16 


398 


13 


400 


3 


402 


2 


403 


Xq28 


406 


1 


407 


19 


408 


8 


409 


4 


410 


3 


411 


4 


412 


5 


413 


22q12.3-13.1


414 


8 


416 


8 


417


20p12.2-13


418 


10 


420 


4 


421 


8 


423 


11 
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SEQ ID NO: 


Chromosomal location


424 


17 


425 


17 


426 


17 


427 


17 


428 


4 


429 


2 


430 


3 


431 


19 


432 


18 


433 


12 


434 


17 


435 


6 


436 


2 


438 


1 


439


8 


441 


1 


442 


2 


443 


11 


444 


2 


446 


11 


447 


19 


448 


11 


449 


19 


450 


3 


452 


3 


453 


5 


455 


17 


457 




459 


18 


460 


18 


461 


14 


462 


5 


463 


11 


464 


3 


465 


2 


466 


11 


467 


13 


470 


19 


471 


6p24.1-25.3 


473 


4 


474 


15 


475 


13 


478 


8 


479 


10 


480 


15 


481 


9 


482 


1q23.1-24.1


483 


8 


484 


17 


486 


15 


487 


22q11


488 


3q 


489 


1 


490 


3 


492 


11 


493 


1p36.2-36.3
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SEQ ID NO: 


Chromosomal location


495 


10 


496 


19 


497 


18 


498 


22q13


499 


5 


501 


6 


503 


1 


504 


10 


505 


20 


506 


3 


507 


18 


508 


8 


509 


1 


510 


2 


513 


6q25.2-26 


514 


6 


517 


3 


518 


5 


519 


12 


520 


13 


521 


12 


522 


15 


523 


15 


524 


8 


525 


15 


526 


15 


528 


4 


530 


8 


531 


11 


532 


4 


533 


17 


534 


3 


535 


18 


536 


18 




15 


538 


13 


539 


8 


540 


X 


542 


2 


543 


5 


544 


Xq25. 


546 


11 


547 


22q13.2-13.33


549 


13q12-13


550 


1 


552 


6q23 


553 


19 


554 


1 


555 


17 


556 


7 


558 


11 


559 


8 


560 


12 


561 


10 


563 


19 


564 


10 









Table 7 



SEQ ID NO:


Chromosomal location


565 


17 


566 


9 


567 


1 


568 


Xq22.2-24 


569 


3 


570 


1 


571 


5 


573 


6q22.1-22.33 


574 


15 


575 


17 


576 


5 


577 


5 


578 


11 


581 


22q12


582 


16 


584 


6q25.3-26 


585 


3 


586 


11 


587 


2 


588 


2 


589 


15 


590 


11 


591 


11 


593 


Xp11.3-21.1


594 


22 


595 


9 


596 


11 


597 


10 


598 


11 


599 


12 


601 


9 


602 


16 


603 


12 


604 


8 


605 


6 


606 


11 


607 


10 


608 


1 


609 


3 


610 


5 


611 


3 


612 


6 


613


10 


614 


17 


615 


11 


616 


6 


617 


16 


618 


11 


620 


18 


621 


17 


622 


17 


624 


22 


625 


3 


626 


19 


627 


11 


629 


3 
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SEQ ID NO:


Chromosomal location


630 


3 


631 


17 


632 


6 


634 


2 


635 


10 


636 


12 


637 


6 


639 


8 


640 


5 


641 


11 


642 


4 


643 


7 


644 


20p12.1-13


646 


15 


647 


2 


648 


16 


649 


8 


650 


4 


651 


13q12.11-12.2


652 


10 


654


1 


655 


Xp 


656


3 


657 


13 


659 


1 


660 


18 


661 


22 


662 


X 


663 


15 


664 


18 


665 


4 


666 


4 


667 


5 


671 


11 


672 


18 


674 


19 


675 


17 


676 


17 


677 


10 


678 


10 


679 


4


680 


8 


681 


5 


682 


4 


683 


6 


684 


1 


686 


11 


687 


5 


689 


9 


690 


A 
*f 


691 


4 


692 


5 


693 


1 


694 


16 


695 


19 


696 


12 
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SEQ ID NO: 


Chromosomal location


697 


11 


698 


11 


699 


10 


702 


5


704 


16 


705 


1 3 


707 


3 


708 


10p11.21-12.1


709 


11 


710 


10 


711 


10 


712 


10 


714 


3 


715 


6q25.3-26 


716 


8 


718 


X 


719 


17 


721 


6 


722 


16 


723 


2 


724 


12 


725 


16 


726 


19 


727 


3 


728 


16 


729 


6 


730 


16 


731 


7 


732 


11 


733 


8 


734 


9q21.11-21.2


735 


17 


736 


5 


737 


1 


738 


1 


739 


1 


740 


Xq22.3-24 


741 


17 


743 


7 


744 


15 


746 


12 


747 


1 


748 


19 


749 


5 


750 


9 


751 


5 


752 


9 


753 


19 


754 


15 


755 


8 


756 


X 


757 


3 


758 


1p12-13.3


760 


6 


761 


19 


762 


8 
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SEQ ID NO:


Chromosomal location


763


12 


764 


2 


765 


11 


766 


11 


767 


15 


768 


17 


769 


11 


771 


11 


772 


17 


773 


5 


774 


18 


775 


1 


777 


8 


778 


16 


781 


16 


782 


1 


783 


21 


784 


6p2 1J2-22.1 


785 


5 


787 


16 


788 


7 


789 


15 


790 


22 


791 


6 


792 


1 


793 


22 


794 


8 


795 


2 


796 


1 


799 


6 


800 


9 


802 


9 


803 


17 


804 


10 


805 


3 


806 


2 


807 


14 


810 


6 


811 


10 


812 


16 


813 


1 


815 


16 


817 


3 


818 


15 


819 


Xq22.3-24. 


821 


1 


822 


6q16.1-21


823 


17 


825 


10 


826 


15 


827 


3 


828 


17 


829 


22q13.33


830 


11 


832 


15 


833 


9q31.3-33.2 
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Chromosomal location


834 


15 


835 


X


836 


11 


837 


19 


838 


10 


839 


2 


840 


1 


841 


8 


842 


4 


843 


1 


845 


16 


848 


19 


849 


10 


851 


2 


853 


10 


856 


2 


857 


1 


858 


5 


859 


2 


860 


19 


861 


3 


862 


2 


863 


11 


864 


3


865 


3 


866 


21


867 


1q42.11-42.3


868 


1 


870 


8 


871 


6 


872 


1 


873 


12 


874 


6q27 


876 


11 


877 


2 


878 


19 


880 


3 


881 


1 


885 


8 


886 


9 


887 


5 


888 


9 


891 


16. 


892 


10 


893 


21 


894 


5 


895 


5 


896 


4 


897 


13 


898 


18 


899 


10 


900 


16 


901 


3 


902 


11 


903 


1. 


904 


13 
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Chromosomal location


yio 


1Q 


n/v7 

yu/ 


in 
XU 






yoy 


X 


m 1 

yi i 
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-I 


nil 

y \2 


c 
J 


nil 

y ii 


1 £ 


yi4 


i 


yi j 


Q 
O 


yio 


1 1 

X X 


917 


1 *7 
X / 


918 


1 £ 

xt> 


919 


xy 


920 


/ 


922 


y 


924 


1 A 

1U 


925 


1 1 

X X 


926 


XI 


928 


1 


929 


1 


930 


12q 


931 


18 


932 


15 


933 


15 


934 


15 


935 


1p35.2-36.13


937 


11 


938 


1 


939 


15 


940 


A 


942 


1 X 


943 


i 
X 


944 


y 


946 


5 


947 


4 


949 


12 


951 


A 

4 


952 


xo 


953 


f 1 

IX 


956 


c 
o 


957 


xy 


959 


Xo 


960 


/r 
O 


962 


X6q24.J 


963 


y 


964 


a 
O 


965 


XqXZ 


966 


X X 


967 




70/ 


17 


970 


10 


972 


10 


973 


Xq12


974 


1p36.11-36.33


976 


2 


977 


20 
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SEQ ID NO:


Chromosomal location


979 


2 


980 


8


981 


19 


984 


6 


985 


5 


987 


18 


988 


3 


989 


11 


990 


3 


991 


2 


992 


17 


993 


10 


994 


12 


995 


1p34.1-36.11


996 


14 


997 


20p12.2-13


998 


2 


1000 


12 


1001 


1 


1002 


X 


1005 


17 


1006 


1p31.2-32.1


1007 


15 


1008


15 


1009 


2 


1010 


13 


1011 


6 


1012 


18 


1013 


1 


1015 


6 


1016 


5 


1017 


12 


1018 


5 


1019 


CITB-H1 2291F22 


1020 


4 


1021 


18 


1022 


1 


1023 


11 


1024 


1 


1025 


3 


1027 


19 


1028 


2 


1030 


3 


1031 


4 


1032 


1 


1033 


3p 


1034 


X 


1035 


1 


1036 


1 


1038 


13 


1041 


3 
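
The chromosomal locations listed in Table 7 follow cytogenetic band notation: a chromosome (1 to 22, X or Y), optionally an arm (p or q), and optionally a band or a band range, as in 22q13.31-13.33, 3p, Xq28, or simply 17. The sketch below shows one way such entries might be split into their parts. The names `CytoLocation` and `parse_location` are hypothetical and do not come from the disclosure, and the pattern is an assumption that covers only the forms just listed; entries that are not band designations (for example a clone name) are rejected rather than guessed at.

```python
import re
from typing import NamedTuple, Optional


class CytoLocation(NamedTuple):
    """Parts of one Table 7 location (hypothetical container)."""
    chromosome: str
    arm: Optional[str]         # "p" or "q", if given
    start_band: Optional[str]  # first band, if given
    end_band: Optional[str]    # last band of a range, if given


# Assumed pattern: chromosome, then optional arm, then optional band or band range.
_LOCATION = re.compile(
    r"(?P<chrom>[0-9]{1,2}|X|Y)"
    r"(?:(?P<arm>[pq])"
    r"(?:(?P<start>\d+(?:\.\d+)?)"
    r"(?:-(?P<end>\d+(?:\.\d+)?))?)?)?"
)


def parse_location(text: str) -> CytoLocation:
    """Split a location such as '6q25.1-26' into chromosome, arm and bands."""
    # Tolerate surrounding whitespace and a trailing period.
    cleaned = text.strip().rstrip(".")
    match = _LOCATION.fullmatch(cleaned)
    if match is None:
        raise ValueError(f"not a recognised chromosomal location: {text!r}")
    return CytoLocation(match["chrom"], match["arm"], match["start"], match["end"])


location = parse_location("22q13.31-13.33")
```

Entries given only as a chromosome number leave the arm and band fields empty.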
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SEQ 

ID 

NO: 


Method 


Predicted 
beginning 
nucleotide 
location of 
first amino 
acid residue 
of peptide 
sequence 


Predicted 
ending 
nucleotide 
location of 
last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (X=Unknown, *=Stop
codon, /=possible nucleotide
deletion, \=possible nucleotide insertion)


2535 


C 


328 


546 


MMRRPVHCATDKEGILAPKHFQAAAGEA 
RTSTDRSGAQAQRSVTPCQWHSVQDSSTY 
SSWWWAAAAETL 


2536 


A 


163 


699 


PADAPSLAAFPGDPQYDPPYCYru 1 Kl^ W v 

PGEGMLLLSQTLCLGEQVLLGAWLVWGPS 

RDPRPLPYLCHDEPYTFDINLSVNLKGPGN 

RLGEPIPISKAHEHIFGMVIMNDWSGNYW 

SSVPVKMTGKELGTWGNFIKAEDWCRSK 

GAVMALPRAVTPTRArNTESTIGAAGVDNE 

VSSTG 


2537 


A 


1415 


3050 


NHKSPMALPYHIFLFTVLLPSFTLTAPPPCR 

CMTSSSPYQEFLWRMQRPGNIDAPSYRSLS 

KGTPTFTAHTHMPRNCYHSATLCMHANT 

HYWTGKM1NPSCPGGLGVTVCRTYFTQTG 

MSDGGGVQDQAREKHVKEATSQLTRGHST 

PSPYKGLVLSKLHETLRTHTRLVSLFNTTL 

TGLHEVSAQNPTNCWICLPLNFRPYVSIPV 

PEQWNNFSTEINTTSVLVGPLVSNLEITHTS 

NLTC VKF SOTTYTTNS QCIRWVTPPTQI VC 

LPSGIFFVCGTSAYRCLNGSSESMCFLSFLV 

PPMTIYTEQDLYSYVUS^SPRNKRVPILPFVI 

GAGVLGGLGTGIGGITTSTQFYHKLSQELN 

GDMEQVA\DS\LVTIXJDQLNSLy^AVVLQJN 

RRAIJDLLTAERGGTCLLLGEECCYnrVNv^^ 

G1VTEKVKEIRDRIQRRAEELKNTGPWGLL 

SQWMPWILPFLGPLAAnLLLLFGPCIFNLL 

VOTVSSRIEAVKLQMEPKMQSKTKIYRRPL 

DRPASPRSDVNDIKGTPPEEISAAQPLLRPN 

SAGSS 


2538 


B 


67 


1280 


XYCRVPTYFHMTPYEGTTST 


2539 


A 


393 


1 


GGIGRGGGAGGGVGAAGSASGGVGRRGA 
GGVIADSGAPGGGVEGGVGASGGWRE/GR 
GTSGGVGGSGG ACGS V/ GGSGG AOOO V O 
ACGSTSDGVGRSRGTIGGLGGSGSAGGGV 
GACGGASGYVGIRGAGGG 


2540 


A 


2 


370 


Al^PLLEQVEIJ'AVASVbAbVlK^^ 
VSWPPPLLLPAATTRSNSTSMHSS1PSIENK 
PPQAIVKPQILTHV1EGFVIQEGLEPFPVSRS 
SLLIEQPVKKRPLLDNQVTNSVCVOPEL 


2541 


A 


50 


247 


MWSAHPI^VLSLKLTLFSLTSDWLSSKDM 
AISI^FKISQILCSVI^APGKRLISVLWOTSS 
LKRS* 


2542 


A 


130 


3995 


HPLDOTniXAAGFLGLRTVGVTKAWRSG 

WLRFPAAMFLYNLTLQRATGISFAfflGNFS 

GTKOOEIVVSRGKIL\ELLRPDPNTGKVHTL 

LTVEWGVIRSIJvLAFRLTGGTKDYIVVGSD 

SGRIVD^YQPSKNMFEKfflQETFGKXSGGR 

STVPGQFLAVDPKGRAVMISAIEKQKLVYI 

LMU)AAARLTISSPLEAHKAKTLVYHVVG 

VDVGFENPMFACLEMDYEEADNDPTGEA 

AANTQQTLTFYELDLGLNHWRKYSEPLE 
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Table 8 



SEQ 
ID 

NO: 


Method 


Predicted 
beginning 
nucleotide 
location of 
first amino 
acid residue 
of peptide 
sequence 


Predicted 
ending 
nucleotide 
location of 
last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (X=Unknown, *=Stop
codon, /=possible nucleotide
deletion, \=possible nucleotide insertion)










EHGNFLITVPGGSDGPSGVUCSENYITYKN 

FGDQPDIRCPIPRRRNDLDDPERGMEFVCSA 

THKTKSMFFFLAQTEQGDIFKITLETDEDM 

VTEIRLKYFDTVPVAAAMCVLKTGFLFVA 

SEFGNHYLYQIAHLGDDDEEPEFSSAMPLE 

EGDTFFFQPRPLKNLVLVDELDSLSPILFCQ 

IADLANEDTPQLYVACGRGPRSSLRVLRH 

GL^VSEMAVSELPGNPNAVWTVRRHIEDE 

FDAYITVSFVNATLVLSIGETVEEVTDSGFL 

GTTPTLSCSLLGDDALVQVYPDGIRHIRAD 

KRVNEWKTPGKKTTVKCAVNQRQWIALT 

GGELVYFEMDPSGQLNEYTERKEMSADV 

VCMSLANVPPGEQRSRFLAVGLVDNTVRII 

SIJ)PSDCLQPI^M\QA\LPAQPES\LCIVEMG 

\GT*KQDELGERGSIGFLYLNIGLQNGVLLR 

TVLDPVTGDLSDTRTR\YLGSRPVKLFRVR 

MQGQEAVLAMSSRSWLSYSYQSRF\HLTP 

LSYETLEFASGFASEQCPEGIVAISTNTLRIL 

ALEKLGAVFNQVAFPLQ\YTPRK\FVIHPES 

NNLniETDHNAYTEATK\A\QRKQQMAEE 

MVEAAWEDERDL\AAEMAAAF\LNENLPE 

SIFGAPKAGNGQLASV1\RVMNPIQGEHT\V 

TI^SLEQN\RAAF\SVAVCRFSNTGDDWYV 

LVGVPKDULNPRSVAGGFVYTYKLVNNG 

EKLEFLHKTP VEE VP AAIAPFQGRV LIG VG 

KLLR\VY\D LGKEGS YFRKC * ELRHI AN YI\S 

GDPDYSGHRVIVSDVQEKFHPGFRYKRKL 

KTKXHF ADD WP\RWVHYRP AS WD YDTV 

GWGQDKFRPTYVWVRLPTLTP1DEVR7DE 

DPTGNKSPVGTRGLAQMGGLPRKAEVITEL 

THVG\ET\VLSLQKTT\LIPGRLQNSLVLLPP 

CTGGIG\ILVPF\TSHE\DH\DFFQH\VE\MHLR 

\SEHPP\LCGGGDHL\SFRS\YYFPCEGM * LM 

GDLCE\QFNSM\EPNKQKERLLKELGPEPPP 

RSVPRKFEGYSGTRYGF 


2543 


A 


68 


425 


SHILPGAPGAPAWWTRWPSTLPEPFPRGRG 
SPAGTSPISRPGLVQSS*ASRGSDSRLPV/GP 
ASCQASGPGPDSRRPPPCTPA\GPHHGSLPS 
AGRVGASAAAAGPPSPAVPLPPAERPAP 


2544 


A 


1 


1982 


D AERQE ALGIVRRIGTDTEAATEP AG AT VP 

AAAAAARIGTVGPQPPAMPRRKRNAGSSS 

DGTEDSDFSTDLEHTDSSESDGTSRRSARV 

TRSSARLSQSSQDSSPVRNLQSFGTEEP\AY 

STRRVTRSQQQPTPVTPKKYPLRQTRSSGS 

ctc n\nmror»T) tnrvxTT A FlTTDF QPPT? TPTGN 
JtlX cU V VDr^UlvD 1 ruN ij\LtniJEtarrs\.Lr iui> 

APSSESDroiSSPNVSHDESIAKDMSLKDSG 

SDI^HXRPKRRRFHESYNFNMKCPTPGCNS 

LGHLTGKHERHFSISGCPLYHNLSVADECK 

VRAQ\TRDKQIEERMI^\HRQDDNNRH\AT 

RHQAPTERQLRYKEKVAELKKKRNSGLSK 

EQKEKYMEHRQTYGNTREPIXENLTSEYD 
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Table 8 



SEQ 

ID

NO: 


Method 



Predicted 
beginning 
nucleotide 
location of 
first amino 
acid residue 
of peptide 
sequence 


Predicted 
ending 
nucleotide 
location of 
last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (X=Unknown, *=Stop
codon, /=possible nucleotide
deletion, \=possible nucleotide insertion)










LDLFRRAQARASEDLEKLRLQGQITEGSN 

MIKTIAFGRYELDTWYHSPYPEEYARLGRL 

YMCEFCLKYMKSQTILRRHMAKCVWKHP 

PGDETYRKGSISVFEVDGKKNKIYCQNLCL 

LAKLFLDHKTLYYDVEPFLFYVMTEADNT 

GOILIGYFSKEKNSFLNYNVSCILTMPQYM 

RQGYGKMLEDFSYLLSKVEEKVGSPERPLS 

DLGLISYRSYWKEVLLRYLHNFQGKEISIK 

EISQETAVNPVDIVSTLQALQMLKYWKGK 

HLVLKRQDLE)EWIAKEAKRSNSNKTMDP 

SCLKWTPPKGT 


2545 


A 


95 


719 


VWPEVTDPEKFVYEDVAIAAYLLILWEEE 

RAERGLTARQSFVDLGCGNGLLVHELSSEG 

HPGRGID VRRRXIWDMYGPQTQT. .BED AITP 

NDKTLFPDVDWLIGNHSDELTPWIPV1AAR 

SSYNCRFFVLPCCFFDFIGRYSRRQSKKTQ 

YREYLDFIKEVGFTCGFHVDEDCLRIPSTTR 

VCLVGKSRTYPYSIEASVDEKRTQYIKS 


2546 


B 


224 


429 


XPFLILLLSPVSTDQANTTTAEIHSQLTPRL 
NLTILSSQGASLQQRVTYHRNHKYGQTHP 
QKAEIWG 


2547 


A 


59 


335 


GLAAGLPETLHISYCMTVFRFESLDSGVWT 
DDHSEACRNMHVLSVWTASCKAEPNPIWP 
HHPWLSCATWPC WKGFDLPGICFTALS CP 
KIYA 


2548 


A 


1 


1605 


PMYLFLCPPLALVQCALKDPRSKYSLGGR 

TTLIITLQGSG KKNNIPHP S SLSERVMTAKD 

GFVSRCHLLMQPKQQKWSLMYPMEGEVL 

ENGCWPTLQDSLLCTALVDKLLVFLGRCF 

CTAVEWMLVTCRTAAAVSAFLIVGRVSS 

PVCRAVSVQPWTLTADHTPGRYCLKLVCR 

QLCLCPSSTPLTEVFCSKEAFFIILDCSNLPH 

ALLPVDSPKGLSKCSNPREKARRKLQGHY 

HVASEVSFVPVRRFPKGEIGANQPGTHRKF 

YHLTHYRQNLKQPDVPHGRIVFDDKDITD 

WQTAKIMREAVAIVPEGRRVFSRMTVEEN 

LAMGGFFAERDQFQERIKWVYELFPRLHE 

RRIQRAGTMSGGEQQMLAIGRALMSNPRL 

LLLDEPSLGLAPmQQEFDTIEQLREQGMTIF 

LVEQNANQALKIADRGYVLENGHVVLSD 

TGDALLANEAVRRGDELTEDRSRSLDGELI 

RSLPCGASYGGLSLRPWSRGHIPQSHQSSE 

SVRVMFINTSKGASIISSSATMPGPLPKHLG 

P 


2549 


B 


1 


597 


MHVQGKAAILGRHFSISSLLPGALLLLTVIK 

GHTHPEEKSPGAHEKAVTGEPKCLGALPY 

CD S GGKJCATKKKD AGEMRSR1KD G VL VL 

KCISLQVGLASWTVSWLRTEATGYTFALLP 

PGTHHTEQTPSKHEQNGAELFCNCVSCFED 

PCPCQVPGTQPGNRLSEEHQASSQADVTNS 

SAPKQPHPPPAPCKGVCSHC 
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Table 8 



SEQ
ID
NO:


Method


Predicted
beginning
nucleotide
location of
first amino
acid residue
of peptide
sequence


Predicted
ending
nucleotide
location of
last amino
acid residue
of peptide
sequence


Amino acid sequence (X=Unknown, *=Stop
codon, /=possible nucleotide
deletion, \=possible nucleotide insertion)


2550 


A 


278 


451 


MAGTAQLLGLKQUGLELLTAQCGQITGY 
RDRREELLPPRFLATGPPSCHPPSQTVP* 










2551 


A 


1 


6530 


MWGSDRLAGAGGGGAAVTVAFTNARDCF 

LHLPRRLVAQLHLLQNQAIEWWSHQPAF 

LSWVEGRHFSDQGENVAEINRQVGQKLGL 

SNGGQELHAVSLEQHLU)QIRIVFPKAIFPV 

WVDQQTYIFIQIVALIPAASYGRLETDTKLL 

IQPKTRRAKENTFSKADAEYKKLHSYGRD 

QKGMN1KJELQTKQLQSNTVGITESNENESEI 

PVDSSSVASLWTMIGSIFSFQSEKKQETSW 

GLTEINAFKNMQSKWPLDNIFRVCKSQPP 

SIYNASATSVFHKHCAfflVFPWDQEYFDVE 

PSFTVTYGKLVKLLSPKQQQSKTKQNVLSP 

EKEKQMSEPLDQKKIRSDHNEEDEKACVL 

QWWNGT .RET .NNAIKYTKNVEVLHLGKV 

WPKDISEEDIKTVFYSWLQQSTTTMLPLVI 

SEEEFIKLETKDGPSRSYGKRRKQGVNSLG 

VSSLEHITHSLLGRPLSRQLMSLVAGLRNG 

ALLLTGGKGSGKSTLAKAICKEAFDKLDA 

HVERVDCKALRGKRLENIQKTLEVAFSEA 

VWMQPSWLLDDLDLIAGLPAVPEHEHSP 

DAVQSQRIAHALNDMIKEHSMGSLVALIA 

TSQSQQSLHPLLVSAQGVHIFQCVQHIQPP 

NQEQRCT1I.CNVIK^KLDCDINKFTDLDLQ 

HVAKETGGFVARDFTVLVDRAIHSRLSRQ 

SISTREKLVLTTLDFQKALRGFLPASLRSVN 

LHKPRDLGWDKIGGLHEVRQILMDTIQLP 

AKYPELFANLPIRQRTGILLYGPPGTGKTLL 

AGVIARJESRMNFISVKGPELLSKYIGASEQ 

AVRDIFIRAQAAKPCILFFDEFESIAPRRGH 

DNTGVTDRWNQLLTQLDGVEGLQGVYV 

LAATSRPDLIDPALLRPGRLDKCVYCPPPD 

QDGSSSSDSDLSLSSMVFLNHSSGSDDSAG 

DGECGLDQSLVS1JBMSEILPDESKFNMYRL 

YFGSSYESELGNGTSSDLEDESMNQPGPIK 

TRLAISQSHLMTALGHTRPSISEDDWKNFA 

ELYESFQNPKRRKNQSGTMFRPGQKFFDEI 

TELTYLPSFHHKAAPHQAEPGPNSSSASAP 

PPYNPFUSSPHTQSGLQFRSVTSPPPSAQQF 

PLKEVAGAKGIVKTALETAPTLALPVSSQP 

FSLHTAEVQGCAVGILTQGPGPCPVAFLSK 

QLDLTVLGSPSCLHAVASAALILLEALKJT 

NYAQLTLYSSTINFQNLFSFSHLTHILSAPRL 

LQLYSLFVESPTTTILPGPDFNLASHIILDTTP 

DPDDCMSLIYLTFTPFPHISFFSVPHVDHIW 

FTDGSSTRPDRHSPAKAGYAIESSTSI1EAT 

ALPPSTTSQQAELIALTRAFTLAKGLHVNIY 

TDSKYAFHILHHHAVIWAERGFLTTQGSSn 

NASLTKTLLKAALLPKEAGVTHCKGHQKA 

SDPITLGNAYADKGVRCAPDPARRPLPLPI 

GLKACHCSCTAKIGGKYRALVGOLKTISV 

ATGLKTQDRTIDGSSQVIEEKNHNGYSVID 
TGTLVEAELEKLPNNWSPQTCELFALSQAL 
KYLQNQKTIS1LI QKEP SPALGLTPERKGNV 
GHAGKGPLESSSPDPFLCGQERREKGCRTA 
TSVS1TNPIMIGPWVVTHPGKELTPEHKGN 
VGHAGRDILAKAGAIIHLNIGEGTPVCCPL 
LEEGINPEVWATEGQYGRAKNARPVQVKL 
KDSTSFPYQRQYPLRPKAQQGLQKIVKDL 
KAQGLVKPCSNPCSTPILGVQKPNRQWRVT 
LCHQATQALFNFLATCGYMVSKPKAQLCS 
QQ/RYLGLKLSKGTRALSEEHIQPILAYPHP 
KTLKQLRGFLGVIGFCRKWIPRYGEIARSL 
NTLIKETQKj^NTHLVRWTTEVEVAFQALT 
QAPVLSLPTGQDFSSYVTEKTGIALGVLTQI 
RGMSLQPVAYLTKEIDWAKGWPHCLRV 
VAAVWLVSEAVKIJQGRDLTVWTSHDV 
NGELTAKGDLWLSDNHLLKYQALLLEGPV 
LRLCTCATLNPATFLPDNKEKIEHNCQQV1 
VQTYAAQGDPLEVPLTDPDLTLCTDGSSFV 
EKGLRKVGYAWSDNGILESNPLTPGTSAQ 
LAELIALTWALELGEEKRAN1YTDSKYAYL 
VLHAHAAIWKEREFLTSERTPIKHQEAIRK 
LLLAVQKPKEVAVLHCRGHQKGKEREDEE 
NCQADIEAKRAARQDPPLEMLIKQPLV 



2552 



748 



1075 



ILPTSLFFLFCFVFFVCF*DRVLLLSPG\WSA 
VARSWLYCNLSLRGFKGFSCLSLLSNWDY 
RCTPLRSANFVFL/CRDRVSPCWPTSVSNS* 
PQWIHPPWPPKVLGITRV 



2553 



B 



766 



MRPVDPDGTEHSLFCPLTALRGMVNSRIQ 
KSPGKPSVCDVPLPISPGQSSQLHGKVFGQ 
LNAGKAAEFLKSPPDHQAQAASTSGPQKT 
TLSKRGLRLQPCQLHSAPHSFQLLPLTQKS 
TWDLRGSAPLHAAQTSLSEFSCHRPDVED 
TLGTKGPDKTQCQSENSTRPQYSPETSQNQ 
PVGKGTDLKVTKLGVPSLMAQDGVNYSV 
KTEAHSTGTTAEPLSSQDRAVRGHNTDSH 
VQTPDLGEDTAL 



2554 



47 



923 



KATRHSAAFWLNKQGVSPAKLPHTSWS 
WSLQTLSFLFSGDLAEKSLQCFPCSAMLLE 
LIPLLGMFVLRTARAQSVTQPDIHITVSEG 
ASLELRCNYSYGATPYLFWMERTVEEAF1L 
LVCLKPWRVASSLEKKEKEDESFQLLLGSR 
YNVLKGSRGETSEGGAESFSSQSPGENQLY 
SEMQFFYLCEQRAVVPTESWVGLINLFFM 
ASWMKHSGKLWSKRNSEELCGTLHITAAQ 
LKDSGTYFCAVEAQFSQEICSLDPNCSWAC 
SPNPFRERGMLPPQYHLHSFGFSP 



2555 



2471 



2985 



ETSLERERLSFCTGSRTTRSAELKAVGFEA 
ALQEVITPEWPASQSEAYQTLRQNQAQV 
HNFFFFWGGDSPTLSPRLECSSAISAHCNLR 
LPGSSNSPTSASRVAGTTGACRHARLIFCIL 
VEMGFHRVAQAGRELLSSANPPTSASQSA 
G1TGMSHHAOPSSOLLISSCC 


2556 


A 


138 


564 


YREVMVSES *ETP AGARGRP YYFS APGTAP 

\PAINVHPPPPSLSATPHPPQPQPPPPHQHNA 

KARVATHOTCRTSNCMRSRKVRKSPPEKW 

VGFNRRPKASCPSPPGAARVDVGGETERR 

EQAAAPGEMGKWARPGEEYFHS 


2557 


A 


2 


585 


AAAAPAGGNPEQRLDYERAAALGGPDGR 

AWGGRSPLPPPAP*AQGAPGPRWPPPRAGS 

PAPSPAGCGGGKGGGLVTPGRGGPRAAGR 

EL/RAVRCPCPVRPRPPSKPALGGSLPQPEP 

AAAPGPSIR/PVLPIQTGSVPWRRPKSLRPVL 

GTRVGRTPPLPPP/PDPAGPPPLPLPGP\HPS 

RPPPPTGPWRPARADGRV 


2558 


A 


2 


224 


PRVRVQWAQLSQDKKGEMNSMTSTAGPP 

GSSSAPCATRRNLLQRQHLQRLSGEFKKDP 

ATYSKHLEPLEEERDK 


2559 


A 


43 


267 


GRLWSAMTPGKLKTLCKIDWPALEVGWP 
LEGSDDRSLVSKVWHKVTYKPRNPDQFPY 
RDT*LELVLDPPPPTHSG 


2560 


A 


233 


692 


DNHPSFPRLP S SRP GTKE VLKEIHISDTT AD 

VIFYPIYRMSEMIFRRIKMPWLWLDLWYL 

MFKEGWEHKKSLKILHTFTNSVTAERANE 

MNANEDCRGDGRGSAPSKNKRRAFLDLLL 

SVTDDEGNRLSHEDIREEVDTFMFEVLYTV 

RFRYH 


2561 


A 


1993 


1379 


SLHLSERADWQYSQRAG/DAVEVFFSRTA 

RDNRLGCMFVRCAPSSRYTLLFSHGNAVD 

LGQMCSFYIGLGSRINCNIFSYDYSGYGVS 

SGKPSEKNLYADIDAA WQALRTRYGVSPE 

NIILYGQSIGTVPTVDLASRYECAAV1LHSP 

LMSGLRVAFPDTRKTYCFDAFPSIDKISKV 

TSPVLVIHGTEDEVIDFSHGLAMYERCPRA 

VEPLWVEGAGHNDIELYAQYLERLKQFIS 

HELPNS*RQSK 


2562 


A 


991 


308 


AAASAFKPGLALSDRAFAAWEPSGAAVSR 

SPLSPPSRPFASREPAGFRAALADPPGMPR 

YEL ALILKAMQRP ETAATLKRTTE ALMD R 

GAIVRDLENLGERALPYRISAHSQQHNRGG 

YFLVDFYAPTAAVESMVEHLSRDIDVIRGN 

IVKHPLTQELKEWEGIVPVPLAEKLYSTKK 

RKK*EDSPDFSLICNSFTFGQHGREGRICKF 

GLYISMCCRCCLIFLRYF 


2563 


A 


1 


344 


MDKSLLLELPILLCCFRALSGSLSMRNDAV 
IEIVQCRMCHLQFPGEKCSRGRGICTATTEE 
ACMVGRMFKRDGNPWLTFMGCLKNCAD 
VKGIRWSVYLVNFRCCRSHDLCNEDL 


2564 


A 


251 


386 


LQRLECSGTI/SAHCNLCLLGSSNPLASAS*I 
AGTTGTLTGDVDST 


2565 


A 


1164 


1273 


EISNIQQADFPGVLATHPAFSRLLPCLHFIP 
KSANQ 


2566 


A 



867 


156 


PAPVKCEGPMYSASVKDQGPMVSAPVKD 
QGPIWAPVKGEGPIWAPVKDEGPMVSAP 
DCDQDPMVPEHPKDESAMATAPIKNQGSM 
VSEPVKNQGLWSGPVKDQDWVPEHAK 
VHDSAWAPVKNQGPVVPESVKNQDPILP 
VLVKDQGPTVLQPPKNQGRIVPEPLKNQV 
PIVPVPLKDQDPLVPVPAKDQGPAVPEPLK 
TQGPRDPQLPTVSPLPRVMIPTAPHTEYIES 
SP 


2567 


A 


625 


182 


QQGKNQECIRNQHTRAPGRGASPQQGEGK 
TWAWVGHPVPHALVTPGLQRGSARGLAW 
RQLGRAR*PRPPAPPRACRPEEPPYTPGRR 
AP GRP AP APRS ACG WAAS ASRWCRRTVFF 
SQ 


2568 


A 


2 


917 


EELLCLDVSENRLERLPEEISGLTSLTDLVIS 

QNLLETIPDGIGKLKKLSILKVDQNRLTQLP 

EAVGECESLTELVLTENQLLTLP*SIGKLKK 

LSNLNADRNKLVSLPKEIGGCCSLTVFCVR 

DNRLTRIPAEVSQATELHVLDVAGNRLLH 

LPLSLTALKLKALWLSDNQSQPLLTFQTDT 

DYTTGEKILTCVLLPQLPSEPTCQENLPRCG 

ALENLVNDVSDEAWNERAVNRVSAIRFVE 

DEKDEEDNFTRTLLRRATPHPGEIiCHMKK 

TVENLRNDMNAAKGLDSNKNEVNHAIDR 

VTTSV 


2569 


A 


481 


1380 


TSKQNAAPLVKYFQEKGLIMTFDADRDED 

EWYDISMAVDNKLFPNKEAAAGSSDLDP 

SMILDTGEIIDTGSDYEDQGDDQLNVFGED 

TMGGFMEDLRKCKIIFIIGGPGSGKGTQCE 

KLVEKYGFTHLSTGELLREELAS*SERSKLI 

KDIMERGDLVPSGTVLELLKEAiMVGXSLGD 

TRGFLID\G YPRE\VKQGEEF\GRRI WRPH S . 

WVTCME\(^ADT\MTNRL\LQRSRSSLPVDD 

TTK\TMAKRLEAYYR\ASIPV1AYYETKTQL 

HKINAEGTPEDVKLQLCTS*LTLLFSEGKN 

ACLG 


2570 


A 


3344 


677 


GAYHKHLMEIALQQTY QDTQNCIKSRIKL 

EFEKRQQERLLLSLLPAHIAMEMKAEDQR 

LC^PKAGQMEmNNFHNLYVKRHlTWSIL 

YADIVGFTRLASDCSPGELVHMLNELFGKF 

DQIAKENECMRIKILGDCYYCVSGLPISLPN 

HAKNCVKMGLDMCEAIKKVRDATGVDIN 

MRVGVHSGNVLCGVIGLQKWQYDVWSH 

DVTLANHMEAGGVPGRVHISSVTLEHLNG 

AYKVEEGDGDIRDPYLKQHLVKTYFVINP 

KGERRSPQHLFRPRHTLDGAKMRASVRMT 

RYLESWGAAKPFAHLHHRDSMTTENGK1S 

TTDVPMGQHNFQNRTIJR.TKSQKKRFEEEL 

NERMIQAIDGINAQKQWLKSEDIQRISLLF 

YNKVLEKEYRATALPAFKYYVTCACLEFFC 

mVQILVLPKTSVLGISFGAAFLLLAFILFVC 

FAGQLLQCSKKASPLLMWLLKSSGIIANRP 

WPRISLTUTTAIILMMAVFNMFFLSDSEETI 

PPTANTTNTSFSASNNQVAILRAQILFFLPY 

FIYSCILGLISCS\WLRVNYELKMIJMMVA 

LVGYNTILLHTHAHVLGDYSQVLFERPGI 

WCDLKTMGSVSLSIFFITLLVLGRQNEYYC 

RLDFLWKNKFKKEREEIETMENLNRVLLE 

NVLPAHVXAEHFLARSLKNEELYHQSYDC 

VCVMFASIPDFKEFVTESDVNKEGLECLRL 

VLNEHADFVDDLLSKPKFSGVEKIKTIGSTY 

MAATGLSAVPSQEHSQEPERQYMHIGTMV 

\EFAFAL\VGKLDAINKHSFNDFXIJIVGINH 

GPVIAGVIGAQKPQYDIWGNTVNVASRMD 

STGVLDKIQVTEETSLVLQTLGYTCTCRGn 

NVKGKGDLKTYFVNTEMSRSLSQSNVAS 


2571 


A 


3222 


5798 


PLLTPLVSKVTAAGVPLFFFFFFFF*DIVSLC 
HPGWSAW*P*LTAASNS*\VKQSSHLSLPS 
SWDNRYAPPRPANYFYYFYFL*RLDLALFP 
KLLLNCWAQVHJPSQPPKVLGL*AQSSEGG 
IHSGLSLPSPCFLLCNPI 




2572 


A 


1 


666 


ASSTPQVTANEEINVTSTDSEVEIVTVGESY 

RSRSTLGHSRSHWSQGSSSHASRPQEPRNR 

SRISTVIQPIJR.QNAAEVVDLTVDEDEPTVV 

PTTSARMESQATSASINNSNPSTSEQASDT 

AS A VTSSQP STVSETSATLTSNSTTGTSIGD 

DSRRTTSSAVTETGPPAMPRLPSCCPQHSP 

CGGSSQNHHALGHPHTSCFQQHGHHFQHH 

HHHHHTPHPCI 


2573 


A 


300 


110 


PCGPPQEKGADCHLKACPTAPCTTFRASCC 
SHPASCSRGKQASMSSTSSSATVPLPANEM 
HSG 


2574 


A 


2 


362 


QELERSMAQRCVCVLALVAMLLLVFPTVS 

RSMGPRSGEHQRASRIPSQFSKEERVAMKE 

ALKVFPTVVSTSnQHEVVEEYSHLFTIQGS 

DPSLQPYLLMAHFDWPAPEEGWEVPPFS 

G 


2575 


A 


1740 


2026 


ENGSLRPKPTGIPLSSARGNELSPTRRRRRP 
WTPNPAGETMSSVQQQPPPPRRVTNVGSL 
LLTPQENESLFTFLGKKCVGAGRGGRAPPS 
RAAGE 


2576 


C 


363 


692 


MLLWPLTQAQSSEMSCCRLGACFTTSLLHQ 
IPATALLEGNLDITLTVQLQILDAHNFPYRL 
CUDRCICFTSSSTYPQIDGLKSSRDIGDKISF 
VRSNGSINMGKPFNF 


2577 


A 


1 


2169 


MEGLNWLSLLAF1FLLCWMLSALKHQTPN 

SSAFGLLDmQWFATGSRMNKNNKPSSH 

AIRNAAFSEVGIGISANAMLLLFHILTCLLK 

HRTKPADLIVCHVALIHIILLLPTEFIATDIF 

GSQDSEDDIKHKSVIYRRNRQSQHFHSTOL 

SPKAPPEKMATQTILLLVSCFVTVYVLDCV 

VASCSGLVWNSDPVRHRVQMLVDNGYAT 

ISPSVLPRLTAPNEWRASVYLNDSLNKCSN 

GRLLCVDRGLDEGPRSVPKCSESETDEDYI 

VLRAPLREDEPKDGGSVGNAALVSPEASA 

EEEEEREEGGEACGLERTGAGGEQVDLGE 

LPDHEEKSNQKVAAATLEDRTQDEPAEES 

CQIVLFQNNCMDNFVTSLTGSPYEFFPTKS 

TSFCRESCSPFSESVKSLESEQAPKLGLCAE 

EDPWGALCGQHGPLQDGVAEGPTAPDV 

WLPKEEEKEEVWDDMLANPYVMGDEGE 

EEEEEFVDDTLANPYVMGVGLPGRGGEEE 

EEEEWDDTLASLYKMGEEHRHKGLAPL 

WEGGQKPSQKLPPKKPDLRQVPQPLASEV 

PQRRQERAWTEGRPLEASRALPAKPRAFT 

LYPRSFS VEG QEP VSIS VYWEPEGSGLDDH 

RIKRKEEHLSWSGSFSQRNHLPSSGTSTPS 

SMVDIPPPFDLACITKKPITKSSPST ,T IDSDS 

PDKYKJCKKSSFKRFIALMFNKMERPGTM 

AHACHPSTLGS 


2578 


B 


1 


360 


MHLLQAALLLAVPCLLCYVAVGYAFSVLL 
TLLLTAPALLPDDFEGFNIREKTGWYGKKE 
GMVTLSNPQVAREKEQFNDLYFNAKQAE 
QKGYLNTARRE A SLAFK VTETTHNKSGLIT 
ES 


2579 


A 


1 


1036 


ATVGGREIYVKGFVHYKVRALFPCEKPPRP 

TEMSRHHSRFERDYRVGWDRREWSVNGT 

HGTTSICSVTSGAG/ERHSQQPQRPARPPAA 

ARG ALP AAHP G YS S CS L/RPPAAARPSP AS 

WPALRLRSPPRLPASPKGTVSPRDWRPASG 

GGRRLSISPHPG/TTDEPPSKQMRESDNPGT 

GPW\GPRWPPGTSPP*SHTPMEWPSLPPS\P 

GCERPGPGHWGDPLTASPRGAPAPADARP 

L\PLPQPPSQPLSS\GWSTCLPRPCMPALSP 

WPCPHCPVWGRWPAQDPPLWATATWQG 

PCCLHRRQPSRPPLSPWPLPPMGPPQFTRP 

TGCRCCGPLAWGSMSSPTRGTPE 


2580 


A 


1 


1535 


MEEKTNVQLPPGQTEQHVEIHIMNFCSKN 

HHRITPEKPKELTDPFKEAACCCKLYEIDK 

KLYRMAEWIKIHKPSICCLQETHLTHKDSH 

KLKVSITFKDLAVRFSEEEWRLLEEGQREF 

YRDVMRENYETLVSVEPGRAVGGGSHAD 

EGQEPAGCGA^SPGPGAAGEGDPRVLVWR 

SQGRYGQPRER\GRGASLDGERASPEAA/D 

GKRALPSPRPAQLPSRRPYQPAPPG\PTPTD 

SSCSSGPTGDGVQGSPLPIRISPGNSPUPRP 

HQLSEGNPCAWAPAPRDIPKLLATSP*PGH 

VQANQSRPGAWEPALGRSDQRACSASGSA 

ELCERWPQQAP/APPEEPPPASPHPAAPTG\ 

PGFWESCGEPGAAVPGKGSAPKPSPLHCLE 

S ALRGILP\EGPC A SP A WE AP AP AP AP AP AR 

ASAA/AEGEDPRPEPELWKPLPQERDRLPS 

CKPPVPLSPCPGGTPAGSSGGSPGEVAPGEQ 

SPGTAAASVQ/VSPAHWPCFS/SPVRYSSGS 
LPGFSAGEKAQG 


2581 


A 


3 


514 


PRLLMEAGPHPRPGHCCKPGGRLDMNHGF 
VHHIRRNQIARDDYDKKVKQAAKEKVRR 
RHTPAPTRPRKPDLQVYLPRHRDVSAHPR 
NPDYEESGESSSSGGSELEPSGHQLFCLEYE 
AD SGEVTS VIVYQGDDPGKVSEKVS AHTP 
LDPPMREALKLRIQEEIAKRQSQH \ 


2582 


A 


307 


1503 


GGSSARPRASSRRMIJSRKKTKNEVSKPAE 

VQGKYVKKETSPLLRNLMPSFIRHGPTIPR 

RTDICLPDS SPNAF STSGDGVVSRNQSFLRT 

PIQRTPHEUVCRRESNRLSAPSYLARSLADVP 

REYGSSQSFVTEVSFAVENGDSGSRYYYSD 

NTFDGQRKRPLGDRAtlEDYRYYHYNSDi^ 

QRMPQNQGRHASGIGRVAATSLGNLTNHG 

SEDLPLPPGWSVDWTMRGRKYYIDHNTNT 

THWSHPLEREGLPPGWERVESSEFGTYYV 

DHTNKKAQ YVRHPC APTCTS V* STTSCHI/A 

S/RQQTERNQSLLVPANPYHTAEIPDWLQV 

YARAPVKYDHILKWELFQLADLDTYQGM 

LKLLFMKELEQIVKMYEAYRQALLTELEN 

RXOROOWYAOQHGKNF 


2583 


A 


1341 


1015 


LGTRGCLNMAAPLSVEVEFGGGAELLFDG 
IKKHRVTLPGQEEPWDIRl^LIWIKKNLLK 
ERPELFIQGDSVRPGILVLINDADWELLGEL 
DYOLODQDSVLFISTLHGG 


2584 


A 


1 


741 



VRSMSCPPSWPYCAPCPTNIGESTSPLRKTI 

ETPTLWDPKAPSCSLELPPWVLASPQRSRG 

TALPFLPSNVLPSLALPSTSFLCRPLLSHLV 

TSLLAGPGAHDGHLRKEGWRSTPEMTSLP 

APEHPASPCDSVLCSPDVSMCTLGPAARW 

DAQAKSAPLPPCCTDCKSFPHLQRPWAQP 

HTSQATSVDSGEAGTKGMSQFIVWTWWR 

SRPCETRQGEGIGNWGYSVTPGPPGSQNLP 

ARLDGQGLAS 


2585 


A 


36 


363 


NMISIJPmWAFCKffiNLCGKCVYMCMCSQ 
NKNNQLKFSFIPGRWCASLKMYSKGQRSL 
MYPCRYHQRMLLVSRYLDTVLLDWDPPG 
PLPEGRQHSP GRRQRDLAS ALLC 


2586 


B 


1 


1107 


MLYWlJvlPKGKLLWIASFLTRJUJGIQHTLP 

RVEEKSIQSVKDDNIYHPHPRPR1AVVGSSS 

TVISYSPGEYAFTNGTSRCPSLSLAAGPRLI 

TNGPWEAHEVQRESTIALMKLLQVLEQKV 

RLREGHSLGTVKMSKNINPMGHVSNPPTS 

YPDELITKQVCPGSHPKRPGEVKHNEEVPT 

SQDRDTCTTQETQYSVRKnSAEDDFTVKN 

YNHIRNKFTIPSRKGQQAHRAWLNKAIPQP 

MPTSATSLLAALVRAAKHRNQQPQDLAQS 

SSHHIYLFITITFGSLRDSELKSKRGPDPQLS 

LELEMVAKAKAVKPENSRRWFSGNQLGSI 

INSPKKGSAVLEGTFQEKOKWDARLTKGD 

CLATNVLNRV 


2587 


A 


1 


384 


MACRVLQGLPFACLSSPICSHSALTLDHSL 

LCI>JFFFNPYALIJ?WYLFGKFQLRIPSSGK 

PFLTSEDQDPAGIFELVEWGNGTYGQVY 

KVRRMVWKYDLHICRAVLGGVEGSRFLV 

CRSEGGYGRC 


2588 


c 


1 


417 


MLLPLFLLIHTGIGPSYSASDRAEPRPSPGG 

RLTARIWIKGVKEDGGTMQGAVDWGEGV 

ERCAGRITASHKVADKWHSSRNSLGGSPE 

PGTPAPGPWVGFCHPCLPASPLSWTATGT 

AATHAQCAERVHNLCRRAKPS 


2589 


B 


1 


198 


MQAGLARAMVLAAGWSRVASAGAAGDT 
SPVPRALSDLRITQKCGLLVPKAVSWKSLF 
LFPITVEL 


2590 


A 


267 


614 


MA VA VLLCGCIV ATVSFF WEESLTQH V AG 
LLFLMTG1FCTISLCTY AASISYDLNRLPKLI 
YSLPADVEHGYSWSIFCAWCSLGFIVAAG 
GLCIAYPFISRTKIAQLKSGRDSTV* 


2591 


A 


5 


447 


SSAFRSVLLEMRVSSRTCIIDTLQGAVPTYP 
GSGTPALGEKSGSLGLVAWSFPRPGESSST 
APRRSPCCCPWSPSHSSPASFPPLRPSAPAT 
RAPREGLPTPASRAHFPGATAIPKTSGLLIA 
TASLCWGQTHQPCPLPLARFLGKR 


2592 


A 


508 


870 


GHCPVLRWTEKHCRACEKEGMDSSIHLS 
SLISRHDDEATRTSTSEGLEEGEVEGETLL1 
VESEDQ ASVDLSHDQSGD SLNSDEGDVSW 
MEEQLSYFCDKCQKWIPASKELLNSFDLSI 
PV 


2593 


B 


20 


201 


MGRVSGLWSRFLTLIJVHLWVITIJWSRD 

SNIQACLPLTFTPEEYDKQDIHALPAVTEM 

ALFVTVFGLKKKPF 


2594 


A 


79 


243 


MSFICFLNFWPTSAIPLRLWNYCGMNSPS 
RSWDCLCTPLSRQSAPVSHMAKVW* 


2595 


A 


178 


1224 


RYRAARNVMKDQRLVFHSKVRSSGYASA 

PH VTMFSPKTNIKSEGKGSSRSRS SCAREA 

YPVECAVPTKPGPQVAAAPTCTRVCCIQYS 

GDGQWLACGLANHLLLVFDASLTGTPAVF 

SGHDGAVNAVCWSQDRRWLLSAARDGTL 

RMWSARGAELALLXRYKQKSKSKLICRLST 

TGAVDMTSLSAVNDFYSHIVLAAGRNRTV 

EVFDLNAGCSAAVTVEAHSRPVHQICQNK 

GSSFTTQQPQAYNLFLTTAIGDGMRLWDL 

RTLRCERHFEGHPTRGYPCGIAFSPCGRFA 

ACGAEDRHAYVYEMGSSTFSHRLAGHTDT 

VTGVAFNPSAPOLATATLDGKLQLFLAE 


2596 


A 


85 


839 


RSGSDvlAAAAATKILLCLPLLLLLSGWSRA 

GRADPHSLCYDITVIPKFRPGPRWCAVQGQ 

VDEKTFIJHYDCGNKTVTPVSPLGKKLNVT 

TAWKAQNPVLREWDILTEQLRDIQLENY 

TPKEPLTLQARMSCEQKAEGHSSGSWQFS 

FDGQIFIJLFDSEKRMWTTVHPGARKMKEK 

WENDKVVAMSFHYFSMGDCIGWLEDFLM 
GMDSTLEPSAGAPLAMS SGTTQLRATATT 
LILCCLLILLPCFILPGI 


2597 


A 


319 


513 


EELRAVAQGIAQSLGQLLFTQCPLEKKDLE 

GLFLQNNKEGVQKGRDEPLPPLP*ATALSS 

IQAGIQQAR*EGDLEAWQFPVRIHPPDQQG 

NIIVTFEPFPFKLFKEFKQAVNQYGPGSPFV 

MGLLKNVAVSSWMIPTDWDALTRACLTP 

AQFLQFKTWWADEAGRV 


2598 


A 


1257 


877 


AVFTFHNHGRTANLYSLHSWLGITTVFLFA 
CQRFLGFAVFLLPWASMWLRSLLKPIHVFF 
GAAII^LSIASVISGINEK1.FFSLKNTTRPYH 
SLPSEAVFANSTGMLWAFGLLVLYILLAS 
SWKRP 


2599 


A 


54 


470 


CSTMNPSEMQR1APPRRQRHRSRAPSAHK 
MNRMVM SEEQMKLP STKKAEPPTW AQLK 
KLTQLAKKK\LENTKVTQTPENMLLAALK 
TVSTVSAGWSSSEESDHRERAMMTTVVL 
SKRRGKCGEKKEISDCY CVYVERS 


2600 


B 


1 


939 


MALRLVIPALWEAELVGALMLAALSHLHR 

FLLSMWVLPPGTFTDAFPGLLFHFPRRSQK 

DCLLGLSKSDQRAMACYFGILLIVSATLCF 

GMDNTyyiX)EFANLIJI)ELLMKINGLSDSLQL 

PLLEKTSNNTGEARTEESPLVDISSYQAAE 

MVMMARTLATCLQHAQGLGFEACLP1LSA 

PHALSHWTLTTCLWQLGFMSAVLILKYTR 

ALLAQGQFSGPFVIDKGVRLELIGLISRVW 

EVSEQENSKEEVYRHEEG1TVISDLLLGRQ 

WQQGHKGICLQLMLPFSRGKHRTSGAFLM 

FSLELFTVAQLVPISGS 


2601 




1 


698 


VLNPLGKP*HDTPAWHEEGYPFPTAPPVDP 

FAKIKVDDCGKTKGCFRYGKPGCNAETCD 

YFLSYRMIGADVEFELSADTDGWVAVGFS 

SDKKMGGDDVMACVHDDNGRVRIQHFY 

NVGQWAKE1QRNPARDEEGVFENNRVTCR 

FKiyPVNVPRDETIVDLHLSWYYLFAWGPA 

IQGS ITRHDIDSPP AS ER WSIYK YEDIFMP S 

AAYQTFSSPFCLLLIVALTFYLLMGTP 


2602 


A 


2 


319 


FYLFILFLFFVFLVETGFHHVGQAGFELLTS 
SDPSALASQSARITGMSHHAWPNFCLLSRD 
QVSPCWPGWS*TPDLR*STFLGLPKC*LQA 
*ATVPSAGEPOCGQ 


2603 


A 


147 


773 


MGLGARGAWAALLLGTLQVLALLGAAHE 

SAAMAA S ANIENS GLPHNSSANSTETLQHV 

PSDHTNETSNSTVKPPTSVASDSSNTTVTT 

MKPTAASNTTTPGMVSTNMTSTTLKSTPK 

TTSVSQNTSQISTSTMTVTHNSSVTSAASSV 

TITTTMHSEAKKGSKFDTGSFVGGIVLTLG 

VLS1LYIGCKMYYSRJRGIRYRTIDEHDAII* 


2604 


A 


2 


331 


WWSSPITARDALX3IKHTMVKIRPLSQATR 
AAKAKARAYAEFLQPAKERPETSAALARR 
LVISALGVRSKQSKTEREAELKKLQEARER 
KRLEAKOREDIWEGRDQSTV 


2605 


A 


549 


641 


CCCCCCCLCFGIHSSKGTHSANSDKWPFDP 


2606 


A 


1 


517 


SCYVCGGTVTGDQWP*EARELVPTDPVPD 
EFPAQKNHPDNF*VLKVSIIRQYCTA1EGKQ 
FTHSIGRI^CIJ^QKLYNGTTKTVTWWNSN 
YTERNPFSKFPKLQTVWAHPEFHWDWMA 
PTRLYWICGHRAYAKLPDQWTGSCVISTIK 
PSFFLLPIKTGELLGFPVYASHEKR 


2607 


A 


2 


406 


FLVETEFCYV GQAGLELLTSRDPPASASKG 
AGMTGVSHQVQPQ**S*LWT*/PSSVEAGT 
SFGLSFLSSSWALSAQEGCLAVPS/SGSRGL 
LVGALLLWTKPSPQLSPVPASQRLSSLSLM 
PPLPOPOHLTHTSIET 


2608 


A 


2264 


37 


FFFNKNLLFIQKLTPGWSPIFKJmCKRGGQ 

GFPSQCP*VNSLAIQGWPSRGVSGKRCQKC 

GGPGPLRTHSPLLASPLQPPSAVTTRPVGLQ 

PPGAL\GLTTTRGRAALP*LP*N*MLKPRW 

EQGDFPPGGWAMEAFSRDSLPLQEGIPGIP 

TSPPTPSEK\NKVPETPGALV*ETGCQTEKH 

FRGGDVSTEGDTYACLDVILNVACLDHGK 

SEHSPKSPSTQSEEQTLRGRGQAVADWPPG 

AGACPGPSARLCRGTMGMPSASEHLKRAA 

LGGK/PPLWRGARAAQEAPGSGFCGITAAR 

GLGRGGGRDRSLPGKL* *KWPVSSTPPGPG 

RAALPAALGW\PGCGPTGM/PGLRSASIPSA 

KARSHTCGFKPKG/LKGRTMEEGQTHRRG 

PHA*AQTPSATGQWQQC/PVPLDQRGKSS 

LRQRPKESNLT\GKDLPHPLSPKPPC\RSLPQ 

TPGQSPAEKLQPLVLSPRSPGPAAEQGAD 

WQGPQR1HPSKWPVKVEPLTPSLQDVGGG 

GGVTVGPACSPRGLPMNASGGTLGLAECS 

SQGEQPRSPTRQRHHGRGLPRAGGLLAEG 

GNRGPKCypPLKHGLMGC*LCKAAARILDP 

GLALTVWEAASH\PSLPCARTPSGSQRALK 

GLGGTRKCCGKGQGVPHD\NSSAGTOPTH 

QQPRNRGCA/GDSDSPSGCWGQANLTTAS 

PATGN*TPGLE*HDVGMEKGLQDQ\QPGPP 

RSADGATETQRGQEAAHNQRARGRTLGS 

YLWSRVGSHSW 


2609 


A 


1 


399 


MDGQARWLTPVIPALWEAEVFIEHMLYAL 
NTLRTVLGRARTLS LNHRCRLLLLSLL VLH 
CVRSVRSWYLFCEAAAEKTLAFAMAEEKP 
KALSMGQIRFRFDSQPINETDTPVQVEMED 
IDIIDVFHOOIGGVY 


2610 


A 


1 


1641 


MGELHMITEEKHQPFMDTQTAAKGTLLEA 
GPGLDPVCLGHIKKVIQRKFWRYSAPGTVP 
TTSAIPGETEWGRLPQWSTAWSETAQHGW 
PAARQSRTTVLHQQPQCDPGPEVTSEQLPG 
VINMLTLKYIKVAAHPHGSWNTRVPCLVA 
VLLTPTRJ^YYTSEIQTTFREYYKHLYENKL 
ENLEEMDKFLDTYTLPRLNQEEVESLNRP 

MTSSEIEAVINSLPTKKSPGPDGFrAEFYQR 

YEEELVPFLLRLFQTIEKEGILPNSFYEASIIL 

IPKPGRDTTKNENFRPIS LMNID AKTLNKIM 

ANRIQQHSKKLIHHNQVGHSGMQGWFNIC 

KSINIIHHINRT^KNHM^ 

HPFMOCALNKLGDDGTHLKnRAIFDKPTAN 

IILNGQKLEAFLLKTDTRQGCPLSPLLFNVV 

LEVIARAIRQEKEIPAPADTSSIIAHHPSPS 

YQPWTPVTRTSHSTPTTTCYPCLECTPAKW 

LTSVSTMGGGLLSVPQGTVRVSALNYCFP 

QLGGGPLMASSASSDYVPESDESEPLFTFE 


2611 


A 


146 


411 


LLSPSHPLTAPPPRPPRPPPTRAPGACASSM 
GPPTSKFPKDLTLPGDAALGCGTPATGGEG 
ASSRARSETQRARAPTPGRSWGRAGSA 


2612 


A 


2 


384 


PIOJSRPTUtPSRSKVSLIEGRGANMAAR 

WRFWCVSVTMVVALLIVCDVPSASAQRK 

KJEN1VLSEKVSQLMEWTNKRPVIRMNGDK 

FRRLVKAPPRNYSVTVMFrALQLHRQCVV 

CKYELQLRFKIK 




2613 


A 


1 


626 


SRVEDFVLHLLRALAQDD VVP YFKTEP GL 

PQIHLEGNRLVLTCLAEGSWPLEFKWMRD 

DSELTTYSSEYKYIIPSLQKLDAGFYRCW 

RNRMGALLQRECSEVQVAYMGSFMDTDQR 

KTVSQGRAAILNLLPITSYPRPQVTWFREG 

HKEPSNRIAITLENQLVIIJ^TTTSDAGAYY 

VQAVNEKNGENKTSPFIHLSIASFCGNTTQ 

D 


2614 


A 


412 


1 


SNLCLGNSWRWRWAKSRHHCIPTVTLSKR 
SGDIRGSHFSSPQRQRSQRVPGKETARVLR 
AGKQGRGQIPIPCPWPPPPPPPPPGSPGPGC 
RQFHQSLEAKARHPASVREMRGKVKMRR 
ALRRAPASTRASSRQPNPK 


2615 


A 


2 


474 


TGPTIKNMDGTFWTSCLKLNSSQEDPGTV 
YQCWRHASLHTPLRSNFTLTAARHSLSET 
EKTDNFSIHWWPISFIGVGLVLLIVLIPWKK 
ICmSSSAYTPLKCILKHWNSFDTQTLKKE 
HLIFFCTRAWPSYQLQDGEAWPPEGSVNIN 
TYSTTV 


2616 


A 


223 


2210 


SLSGFTREASFEMAAQRIRAANSNGLPRCK 

SEGTLIDLSEGFSETSFNDIKVPSPSALLVD 

NPTPFGNAKEVIAIKDYCPTNFTTLKFSKG 

DHLYVIX)TSGGEWWAHKrTEMGYIPSS 

YVQPLNYRNSTLSDSGM1DNLPDSPDEVA 

KELELLGGWTDDKKVPGRMYSNNPFWNG 

VQTNPFIJ^GNWVMPSIJDELNPKSTVDLL 

LFDAGTSSFTESSSATTNSTGNIFDELPVTN 

GLHAEPPVRRDNPFFRSKRSYSLSELSVLQ 

AKSDAPTSSSFFTGLKSPAPEQFQSREDFRT 

AWLNHRKLARSCHDLDLLGQSPGWGQTQ 

AVETNWCKLDSSGGAVQLPDTSISIHVPEG 

HVAPGETQQISMKALLDPPLELNSDRSCSIS 

PVLEVKI^NLEVKTSIII^MKVSAEIKNDLF 

SKSTVGLQCLRSDSKEGPYVSVPLNCSCGD 

TVQAQLHNLEPCMYV A V VAHGPSILYPST 

VWDFINKKVWGLYGPKHIHPSFKTVV\TIF 

G\HDCAPK\TLLGSGE\VTRQAPNPAPVALQ 

LPQDLKVCMFSNMTNYEVKASEQAKVVR 

GFQLKLGKVSRLIFPITSQNPNELSDFTLRV 

QVKDDQEAILTQFCVQTPQPPPKSAIKPSG 

QRRFLKKNEVGKnLSPFATTIXYPTFQDRP 

VSSLKF 


2617 


B 


10 


462 


MSGWLGLVSSLHRLLVSPCPGRTVGLQRR 

KRIiCSGSSRMSFPVTRJ^REQTPHPDIVAAI 

PSGTDDFQGHRSKEKENWKPMCLNRFILE 

EC1AADDFRIRGLEPNPQYLQGKPTQVSES 

LRLLRNDTQDPNIKTRYIMNLAKTIQRSPD 

K 


2618 


B 


1 


406 


MHIPKNLNMCALQSKPESRGFGELSQRGN 
VKFNVETLCSHQKKISRLSAAIHQLDISDIR 
PLTVLLTLCITIALIJV[RGAQPGMNSGKIPY 
RMFIPNSHSDSELMSFQDSVRHRRGGFQTF 
DCDSQQETFWTWSIX 


2619 


B 


1 


789 


MGRERDPSGWTWLLRCAAAAC ALLLG SQ 

RQETQLLLSEHSDPDIEHRVRGEPKRTTRW 

LGVECWRQGVINIETKAQEQLQPKGKKVS 

SLLTALPGSIDELSLKRDVKESISLPAVPFQI 

ELLLISKINMQTRLLQLPLKFAVAAASSRF 

NPRPPVIGQLLRGKKSTPWQPDKPIKSPAG 

VTAATLQAGVGWAEEQSGHCAQVHSLGV 

DSSCWSPRSGYTYVHHPVHTPTLCALVGS 

GGERGGGEGEKHIGLEEQEPQKRVLN 


2620 


A 


3 


913 


FMTDVNSWLLTFGFQUDSTVTPGYPKPDMD 

AMEPSYELIHTQMKTQEWDNSKSILGVQC 

EVQKQLKAFVTLERFDQLYGSTITSCQQAP 

KTKKPASSGSWGKGVKFAIJ^GRVTTDII 

SVANEDGRRVAAIUSfHAHYLENLHFITDG 

VDTHYFVKPGPSEGDLAILGLSGGRRTLEN 

GVNVWSQINTVLNGRTRRYTDIQLQYGA 

LCLNTRYGTTLDEEKARVLELSRQRAVRQ 

AWAREQQRLREGEEGLRAWTEGEKQQVL 

STGRVQGYDGFFVISVEQYPELSDSANNIH 

FMRQSEMGRR 


2621 


A 


30 


2298 


LTRAPDPDR VG L V ADFLRLF IPT AKGP VIN 

APLPQRLRSNTAPIRTLHAPSVHRPTGRES 

MPRTRLTRARTSPDTTGSDKTPHPRPKTLPI 

QTRSCADSGKLSEIRK1DDPLQHHLQNQSI 

QKSVKQCHEQNMFGNIVNQNKGHFLLKQ 

DCDTFDLHEKPLKSNLSFENQKRSSGLKNS 

AEFNRDGKSLFHANHKQFYTEMKFPAIAK 

PINKSQFIKQQRTHNIENAHV CSECGKAFL 

KLSQFTOHQRVHTGEKPHVCSMCGKAFSR 

KSRUvIDHQRTHTELKHYECTECDKTFLKK 

S QLNIHQKTHMGGKP YTCS QCGKAFIKKC 

RLIYHHRTHTGEKPHGCSVCGKAFSTKFSL 

TTHQKTHTGEKPY1CSECGKGFIEKRRLTA 

HHRTHTGEKPFICNKCGKGFTLKNSLITHQ 

QTHTGEKXYTCSECGKGFSMKtlCLMVHQ 

RTHTGEKPYKCNECGKGFALKSPLIRHQRT 

HTGEKPYVCTECRKGFTMKSDUVHQRTH 

TAEKPYICNDCGKGFTVKSRLIVHQRTHTG 

EKPYVCGECGKGFPAKIRLMGHQRTHTGE 

KPY1CNECGKGFTEKSHLNVHRRTHTGEKP 

YVCSECGKGLTGKSMLIAHQRTHTGEKPYI 

CNECGKGFTMKSTLSIHQQTHTGEKPYKC 

NECDKTFRKKTCLIQHQRFHTGKTSFACTE 

CGKFSLRKNDLITHQRIHTGEKPYKCSDCG 

KAFTTKSGLNVHQRKHTGERPYGCSDCGK 

AFAHLSILVKHKRIHR 


2622 


B 


1 


2034 


MKLMETI^QCINAGHEMTKAIAIAQFNDD 

SPEARKTTRRWRIGEAADLVGVSSQAIRDA 

EKAGRLPHPDMEIRGRVEQRVGYTIEQINH 

MRDWGTRKRRAEDWPPV1GVAAHKENT 

LLPFYLGEKGDVTYAIKPLAGRGLTYFFLS 

GSARIENElJVlGKFVERiCLATHTTLSFDWPL 

ETTPQLLPPHILSPVFASASPSRCWRVASGK 

YCKVFRGSGFQAQXIPQPTLRDPIIYVEDK 

GHKYLVFEAOTGTENGYQGEESLFNKAYY 

GGGTNFFRKESQKLQQSAKXRDAELANGA 

LGIIELNNDYTXKKVMKPLITSNTVTDEIER 

ANVFKMNGKWYLFTD SRGSKMTID GEN SN 

DIYMLGYVSNSLTGPYKPLNKTGLVLQMG 

LDPNDVTFTYSHFAVPQAKGNNVNRFTQF 

RLSETKEITNPYAMRLYESLCQYRKPDGSG 

IVSLKIDW1IERYQLPQSYQRMPDFRRRFLQ 

GQFDHAASPVERGHLRKIPFRGGTRESRER 

GLSEAGYLPREAGQAQKRRPWTKGPLEKI 

GLETHJODSRRYPCRSNWVWICTVKEGGR 

EGRGGRGRRVQLAAVAGTVAPAAAPKNP 

PPRFRWSVWARDGVKERVPLQAGVGGGQ 

AVQRRETARRSRGWLLRIWDSIGRDRSLG 

GNGFFTTADQRFDFAVLWLVAFRINSDKL 


2623 


A 


513 


796 


TGTAWTPPPPPLTTGAPCTPPPRCTARGRT/ 
PGDSHLGGGPAATAGGPRTSPMSSGGPSAP 
GMRPPASSPKRNTTSLLNSGLEPTFSFRITF 

GFM 


2624 


C 


60 


472 


MPLLEYARNMLRTWS SLP WTRFRVCLLSL 
SLFLWANRLEDSRSCQPNPMSLTTLPGHRL 
KEAVWO>APSRTMSPHLDPNQLGILLRVLR 
KEKEDGD YPDMMATHP S SRYEACSS GITL 
AAPPTHGPRPTDPRIGPAP 


2625 


A 


1 


1322 


MAII^KVIYRFNAIPIKIJVTFFTELGKTTLR 
FIWNQKRACIGKSVLSOKNKAGGITLPDFK 

LYYKATVTKTAWYWYQNRDIDQWNRTES 

SEIMLHIYNHUFDKPDKNKKWGKDSLFNK 

WCWENWLAICRKLKLDPFLTSHTKINSRW 

IKDLNVRPKTIKTLEENLGNTIQAIGMGKD 

FMTKTPKAMATKAKIDKWDLIKLKSFCTA 

KETTIRLLGRPP ALFTAS S S VLKQLALEGILI 

LDSRALLGFLYEARHSHSNSPNHDAQNAT 

SKKNIRDGYDKIYRQEQVLARMEEKTLITA 

GG>TVKWCSHFRKQIGGQWLTLETKTKTPQ 

PFSSTSQISTDKDKGLNPQLLKMDPGHMG 

WCPPGMGIPWQLSSDDRVWVLAAAGSGR 

HPGSGFK.SL/PGLLHEGSYGH** * *S *I*GGN 

S*GSSGGPOaSGEERVFRWQSI 




2626 


A 


129 


329 


VSNIVDPHQTVGLSTQEPGDIFTYSEFDGIL 
GLAYPSIASE*SWVLDNTMQRHLVAQDL 
FSVYMSR 


2627 


A 


43 


456 


EFFHHVGQDGLDLLTS *S AHLGLLKCWD Y 

RREPPRPASDGHY*TDATGSLPSSGTT*IRT 

KPSQAPASWGLWNLAHHPPRSHPSCPMAN 

LICSTLSSFDGGSPGTGPGGWCPLGLSGSPA 

RAVFKDSSCSLHPLATG1 


2628 


A 


3 


290 


RQGFPLCNHKGTVTADLQPLPPGLK*ISHL 
SLLSSWNYRCTPPHPADF*FFVERRSHYVA 
♦ACLELLCSSDLPALISQRVGITGMSTrPGPI 
CLL 


2629 


B 


1 


804 


MVIVGLAAGVLLVGPGDGGLISEGVVRED 

LMCGVWSAGTWSVGTAERCLEKPGALHV 

IEGPLDSWDGPVMPNGPVKNHKGEQQEVP 

SKHPQMALEICLCLDFLYYPFLRGDASAGP 

VTWCTTSDTIILQQHRTLTSQGVDDFLKAK 

ATFKASDFIDALVLSKDLNSGGRMELEIKC 

LIKVLELDLEGSGEPWKVLDKGVTVSYVF 

EMTIEGCLEGVNKSQETREGACGAGLEMA 

KEGSCLDERSSGTVSGYTQVSSELVCSGFL 

SPG 


2630 


A 


322 


549 


GGGSSPRELAGAAGLTVTSQAVAARRQQP 
SFSRARAPAHSLRAALSLASSARSWGAVSR 
DRGPCPPAIMYQSSNKC 


2631 


B 


1 


384 


MLVPVULSPCLVGIEPWEVSPHTNSTSSYE 
STPKSYPLGTAAKAASGQSPSTTSPLPETAP 
STLHERGLENWCSDKDLRQATGYSAAEK 
SKPPGLCTRAFCPEAIPDAQDWVKCQPLGS 
LSALNF 


2632 


A 


1 


275 


KTSQDTKPSVLWKDVNSNLWCRPHDLLT 
WGRGYACVHIPSGPLGIPVQCIKPYHGMA 
GTQCSTGNEECEPVGPAAPDNAASSDNTG 
PGWGM 


2633 


B 


56 


3476 


XGKPEKFSFGLLDLPFRVGVPFNIPLEFQDE 
FGHTSQLVTDIQPVLEASGLSLHYEEITNGP 
NC VIRG VTAKGP VNS CQ GK V APNLP VYW 
DCSS SGTSILTGS AIQVONTKKDOTLKARIEI 

PSCKJ)VAPVEKTIKLLPSSHVARLQIFSVEG 

QKJUQIKHQDEVNWIAGDIMHNLIFQMYD 

EGEREINITSALAEKIKGLLPDVQVPTSVKD 

MRYCQVSFQDDHVSLESAFTVSMLELLQL 

MVSLKTSNLLNNFRPLPDEPKHLKCEMKG 

GKTVQMGQELQGEVVHITDQYGNQIQAFS 

PSSI^SL^1AGVGIJ)SSNIJCTTFQSIPVINGR 

DLQNPHVQLCDQWDNPAPVQHVKISLTKA 

SNLKVKAlWKSnEGPIIKLMILPDPEKP.VR 

LNVKYDKDASFLAGGLFTGYVRPVPVPRS 

LNSDISYFGVGGKQAVFFVGQSARMISKPA 

DSQDVHELVLSKEDFEKKEKNKEAIYSGYI 

RNIOGSMFEKGKWKIV>ILREIQDDMQTLY 

VNTAADSFEFJCAHVEGDGVVEGIIPYHPFL 

YDRETYPDDPCFPSNNFGISFVHSLEVILXL 

KDEDDEDDCFILEKAARGKRPIFECFWNGR 

LlPYTSVEDRGLAPIECY>nUSGALFTNDKF 

QVSTNKLTFMDLEDO.KDKNTLFTRILNG 

QEQRMKIDREFAL WLKD CHEKYDKQUCFT 

LFKGVITRPDLPSKKQGPWATYAAIEWDG 

KIYKAGQLEPQALYDEVRTVPIAKJLDRTV 

AEKAVKXYVEDEN1ASLWILGYKPVQHMT 

VLSTAGNCNTTFWKKimTVILRCRSLTKV 

LLATERTFETAGVGGLILGQVEEARLKEAQ 

LRNELKIHNIDIPTl^QQVPHIEALLKRiCLSE 

QEELKKKPRRSCTLPNYTKGSGDVLGKGQ 

STGLGPVEVTQSSPSSRTSEYFWLTKFCWL 

EDWASGESLRLLPLMVEGEGEPVYAEIIW 

QKRDETVKDGVTLYLLQSVNQLLLTATKE 

RIDFLPHYDTLVKSGMYEYYASEGQNPLXI 

YTHVGDREAQAALKLGRWSHPRTPNAVG 

APGPPEGAGGGDAVTSQSALLTFSRTRFAS 

GAHAGAHPVLLRNEEEKGAPALVAP1FSAE 

GPTCSLWWTLRPASTAGLKLPARRVHATQ 

PERAH 


2634 


B 


1 


384 


MLASPLWLQALSLAAGTWRPRLGSGQAG 
NSEMRAGFI^GAGSQVRAQLQDRLPKTTE 
TKGALWPHTELCGMWSIAPGAENQELQID 
SPLLGQLSNQVWREDGYGKAFRLRTLSSM 
GITEEANENVLI 


2635 


A 


628 


1117 


FFISVINGQVSSVQRLSGVGPACLSCGSANP 

GPPPGTSPGAGAQRR*\PRADGSGSPQWPR 

GARVGGGRLGTGGRGRPGWRQVPRRLSP 

GFGR*GGTGPGPVGTSGKRGPSRRRAPAN 

DKAACWPRFPGQPAS*TGFRGERGVKGFS 

SWGSGWRAWEDGGTVH 


2636 


A 


70 


792 


HGLVLDVRGPLSHAAPYWAPYPAATAAA 
ARTAPLPPRSAIV*/SGPQPDFQELRKTWPS 
QaGMARREPLLPlTAIPRVVVETTP*GFAK 
QEPSVAGLRCRGSEAPA*LLHGVHRNVS/E 
TPGPEMGRPG*GNHRQRPGKQRGLPSSGLP 
GRCSGSRGPHSSPGQKPHGSTLSGRRGADP 
RPRRRVYLSTPLLCEKKPHHDTILKRKPGM 
GDGNNPCPWNAGLYGQATRFAPLPLCPRR 
RHGAVS 


2637 


A 


571 


172 


SPLRPLLLALALASVPCAQGACPASADLKH 

SDGTRTCAKLYDKSDPYYENCCGGAELSL 

ESGADLPYLPSNWANTASSLWAPRCELT 

VWSRQGKAGKTHKFSAGTYPRLEEYRRGI 

LGDWSNAISALYCRCS 


2638 


A 


169 


1144 


INYSLEKHVGALGRVLFSL'RAGCPGMGST 

RERGLYIX3KHRGSGGIW*ALAGP*KSRGD

SVSLTQGHTHVCSRSPR*ADSPPG/SHLSPV 

PHSVEVAGHVLVPATRAAVPCSASAGA*Q 

STYRTGVHQGNPTV*TK/PSRRPSGGVAK* 

FLPSAVRGEPGAKPLVDDLLPGWSLATHG 

QPPLVAAPGSGLWGRPADA*GCETAGGSP 

CPRSTSRPSGPSGVQGCPLG*AGSGASASR 

SEPPGSTSCCPRAP\T*PAAPCVPDWPAGDQ 

WRSHGYLPPSREL*G/WMPPSRPATLPQLA 

FARQRQGNRFDAAFESSGEDFHQMPRVGR 

MG 


2639 


A 


1 


1461 


MRELYSIWLKGYWTEGDWAQSPPRSPREA 

LEGIRVHLRCFKAYGnVLCQCPWNTPLLP 

WKJPGTKHYEPVQDLRLVNQA'IVTLHPTV 

PNPYTLLGLLPAEDIWFTHLDLKDAICSIRI 

APESQKLFAFQWEDLQSGVTTQYTWNWL 

PQGWVLKRVDALFQHLEDCGYKVPKKKS 

QICRQQVRYLGFTIWKGEHSLWSERKQVIC 

SLPEPKTRRQVREFLGAVGFCTLWIPNFAV 

LAKHLYGITKGGNWEPFEWGPLQQQAFLS 

ESPVEHNCVEVLDSVYSSRPDLRDQPWAS 

VDLELYLDGSSFINPQGERCAEYAWTLDA 

V1ETKPLPQGTSAQKAELIALTRALELSEGD 

CIWIKDCNIAPIJ^RWKGPQTVILTTPTAV 

KRSIAIGNWQDDEWLPERITQYYGPATWA 

QYGSWGYYNPIYMLNQMIWLQAVLEITTN 

KTGRALTILAWQETQMRNPTYQDRLALDY 

LLAAEGGVCGKFNLTN 


2640 


A 


254 


418 


MAISWKPTGLPWHSMLQVLLAAWLPGPTP 
TPHSALPSFSPPPSLPPKMCLPKCC*


2641 


A 


433 


3 


ASFFNFSICICKDLEVGPPVGHPAHDDVGG 

RHGPGGR/GSRSPRSLQCAPGGGRRSGCPA 

GSSPASTCPPSPGGSGADRFGPSPPPPSREA 

APTAGAAASSTSSGASCPPVPASSRWGVRS 

RTRSGSGGEREPRDRPSERPRLV 


2642 


A 


2 


798 


WEFADVEKKGAGRTEFRYPSYVQHIMGD 

IFSQGFGPFRWVCTSGDPQDLAVTDELATS 

VLEEAIADGVKVSVKLQYMDNIRWIREAA 

RHRLWGSQARILYSDQKGRVAIAVAINQ 

AIACRRIKAPVVLSRDHHDVSGTDSPFRET 

SNIYDGSAFCADMAVQNFVGDACRGATW 
VALHNGGGVGWGEVINGGFGLVLDGTPE 
AEGRARLMLSWDVSNGVARRCWSGNQK 
AYEIICQTMQENSTLWTLPHKVEDERVLQ 
QALQL 


2643 


A 


1 


2504 


QISSGRELRVIQESEAGDAGLPRVEVILDCS 

DRQKTEGCRLQAGKECVDSPVEGGQSEAP 

PSLVSFAVSSEGTEQGEDPRSEKDHSRPHK 

HRARHARLRRSESLSEKQVKEAKSICCKSIA 

LLLTDAPM'NSKGVLMFKKRRRRARKYTL 

VSYGTGELEREADEEEEGDKEDTCEVAFL 

GASESEVDEELLSDVDDNTQWNFDWDSG 

LVDIEKKLNRGDKMEMLPDTTGKGALMF 

AKRRERMDQITAQKEEDKVGGTPSREQDA 

AQTDGLRTTTSYQRKEEESVRTQSSVSKSY 

IEVSHGLGHVPQQNGFSGASETANIQRMVP 

MNRTAKPFPGSVNQPATPFSPTRNMTSPIA 

DFPAPPPYSAVTPPPDAFSRGVSSPIAGPAQ 

PPPWPQPAPWSQPAFYDSSERIASRDERISV 

PAKRTGILQEAKRRSTTK^MFTPKEPKVSP 

NPELLSLLQNSEGKRGTGAGGDSGPEEDY 

LSLGAEACNFMQSSSAKQKTP\PPVAPKPA\ 

VKSSSSQPVTPVSPVWSPGVAPTQPPAFPTS 

NPSKGTWSSIKIAQPSYPPARPASTLNVAG 

PFKGPQAAVASQNYTPKPTVSTPTVNAVQ 

PGAVGPSNELPGMSGRGAQLFAKRQSRME 

KYVATDSDTVQAHAARAQSPTPSLPASWKY 

SSNVRAPPPVAYNPfflSPSYPLAALKSQPSA 

AQPSKMGKKKGKKPLNALDVMKHQPYQL 

NASLFTFQPPDAKDGLPQKSSVKVNSALA 

MKQALPPRPVNAASPTNVQASSVYSVPAY 

TSPPSFFAEASSPVSASPVPVGIPTSPKQESA 

SSSYFVAPRPKFSAKKSGVTIQVWKPSWE 

E 


2644 


A 


938 


652 


RSSDGHAAETSRSCQLH*VSRSRNHPGPQP 
SGmi.RVRQSLSPPDSRTLASAILAPP/TPLS 
SFRALALQPQEENRREEEMKEEGQVLGAV 
PLRTS 


2645 


B 


182 


394 


MATHPSLLVCQVGLLGAQVPSVRAGMPQS 

RRQTEGAQGMVRNEEGGSLRLSHHQACK 

ATHTQQWTLEVTAQ 


2646 


B 


1 


591 


MTIHILILLLLLAFSAQGDLDTAARRGQHQ 

VPQHRGHVCYLGVCRTHRLAEIIYWTRCLH 

QGALGEGQPRAPGPLQLWAPPVARGGSPA 

RFPGFRPAARGLAQCPARWVTSGTARPLL 

GFSLPIWLQRDMAEAHQAVGFRPSLTSDG 

AEVELSAPVLQEIYLSGLRSWKRHLSRFW 

VRSGAGRFPSGDPGFCFRDV 


2647 


A 


1 


787 


FQEAAVQLYSHAPHVQLRLKISPGHSPPAL 
GLSFPPGQGRGFSCQLLPASFSWGIPQRPLP 
QREPPGRTRTPAWSCSWGPAIPPVHTLVPA 
PSPGPGADRGGSQGPGLLVQGLPLGSLAP* 
ALGLPGASADTPVPRRLHSQACCSHGVTG 

*GMG*GDVSPVPVPQGPLGWHLFRVPAGS 

QRSSPIPHQVLGGTRQPLGPGPVRKWTELA 

GDTGDKKEASSPKELVGPQRVGGLAGTVT 

LVPHLCCGRRAPPGGLDGAVEIVA 


2648 


A 


2466 


3395 


KALCPCLPVPLVHGNVEVAGPRSGGACPT 

LGLVVIiWPGh«lAATLRAHGQPCTALWR 

PLKPSPQGYLEGAARGSAAKRPLQRALVS 

LDPGLGVLAATRLPGPVAGGWETQYMCC 

SAAAGSVGCQVAKQHVQDGRKERLEGFV 

KTFEKELSGDTHPG1YALDCEMSYTTYGLE 

LTRVTVVDTDVHVYYDTFVKPDNEIVDYN 

TRFSGVTEADLADTSVTLRDVQAVLLSMF 

SADmiGHSLESDLLAIiCVfflSTVVDTSVL 

FPHRLGLPYKRSLRNLMADYLRQIIQDNVD 

GHSSSEDAGACMHLVTWTC 


2649 


A 


178 


556 


QSPQEHFHPECGRRDILCQVRQEIRWPNPG 
EVHHLGLEICPVWILQLHLALRTRAPEHPL 
QVHRPGGGAV*RGVPPPLRLLQACDGPEV 
PAAGRPRPARSSPGQWPP*/PAAVAPPVTE 
RPPTPSAA 


2650 


A 


803 


1068 


RAMEPLLLGRGL1VYLMFLLLKFSKAIEIPS 
SGKVKTFSAILLSMDSPFQAGGIFGTPPGLG 
SRILSPSPMVSLGSCCTHRSPICFSP 


2651 


B 


1 


559 


MAERAAGGQLPSQGPVQLPSTRKEKDEQT 

ENQQLFFIRQRTESPGKARPPNLETQTSGFQ 

EPQLTGAEPLRGQCHGLELPLMNFWRCHL 

DKTNLRLKEELKAEKKSGFWDNLVLKQNI 

QSKKPDEIEGWEPPKLALEDISADPEDTVG 

GHPSWSGWEDDAKGSTKYTSLASSANSSR

WSLRAAGKAX 


2652 


A 


1 


526 


FRLGRKPR*GGVM*PVWSRGEPGSVGAEA 
G/RS*SAPRRLLHHPAAGLATGLSASGRRS 
ARWKMERASGLSPGGGLGATSRQMSPGT 
QLANPPDHGDKDCLGRISPGSGKQIQAAG 
QLPGPPTSLAPAQGRLRSLTPWGLQTPEHS 
BPEGIGHIX)AATEAVLPHSTQNLITKRNLM 


2653 


A 


3 


396 


AAYTLLLHAELLQWSDKPCVPHLLQRDSY 
YVYTQQELKEKLYQEIISYFDKGKMWEKA 
IKXSKElJS£TYESKWDYEGLGNLIjCKRAS 
FYENIIKAMSPQPEYFAVGYYGQGFPSFLR 
NKCFIYRGKEYER 


2654 


C 


1 


507 


MPLTHPNHGPDTLQRWTSSQTPTSLSSKLN 

PEPEADAASILIATSILYKQSDPYLDILARV 

YGPPTAAEENLKCLKEQGQAHLRHFLLCK 

MAAPIAVVXTAAMFENWTHRRQWQVFEP 

GAREEEKSLKSPRFLALKVLRKGADFQRL 

RLYOANMGOAKLPLALFHPLC 


2655 


A 


178 


1206 


ALMNKCAVSTGRQRCSVMWARACSVFCV 

LTLRNTGAQKHWLTEGAAKEHCVSDDSE 

HFESWRAAQLFESVDAEPMNMESQLHFIM 

PKALRTKKAASDSSKEQVANSRESSPSPKE 

VNDSPRAATKSPESQNLIDGTKKPSLKQPD 

SPRNISSDNSSKGTPSSPAGSTTAIPKVRIKT 

IKTSSGEIKRTETRVFPEVDLDSGKKPSEQM 

VSVMASVTSLLSSPASAAALSSPPRVPLQS

AWTNAVFPAEPTPKQVTIKPVATAFLPVS 

AVKTAGSQVINIJQ.ANNTTVKATVIPAAS 

VQSASSmKAANAIQEQAVMlVlPASSLANA 

KLVPKTVHLANLNLLA 


2656 


A 


215 


389 


KGAGVLQTFGSSESVFCIDVDRELLIFAYQ 
NILLFLKmRALILETTCFGWVGTVKRT 


2657 


A 


1 


737 


FRGEIAENLPEQDILIQSVCETMVPKLVAED 

IPLLFSLLSDVFPGVQYHRGEMTALREELK

KVCQEMYLTYGDGEEVGGMWVEKVLQL 

YQITQINHGLMMVGPSGSGKSMAWRVLL 

KALERLEGVEGVAHIEDPKAISKDHLYGTL 

DPNTREWTDGLFTHVLRKIIDSVRGELQKR 

QWIVFDGDVDPEWVENLNSVLDDNKLLTL 

PNGERLSLPPNVRIMFEVQDLKYATLATVS 

RCGMVWFSED 


2658 


B 


41 


166 


MKIAALLGCMMMAARCGTLSAMRDLSFS 
DENRRLAVGTAAAA 


2659 


A 


1 


894 


MPGPMSLWLLLLVLPLSLEHSDLRICFPGQ 

WSMESSSTGFIWTDVRAWQTSNRHVSSW 

REPRHSRMPPGAGLMERIQAIAQNVSDIAV 

KVDQILRHSLLLHSKVSEGRRDQCEAPSDP 

KFPDCSGKVEWMRARWTSDPCYAFFGVD 

GTECSFLIYLSEVEWFCPPLPWRNQTAAQR 

APKPLPKVQAVFRSNLSHLLDLMGSGKES

LIFMKKRTKRLTAQWALAAQRLAQKLGA 

TQRDQKQILVHIGFLTEESGDVFSPRVLKG 

GPLGEMVQWADILTALYVLGHGLRVTVSL 

KELQR 


2660 


A 


3 


14703 


AAAVSARRAAAGGSRGAGGWGTADASG 

AMAEGGEGGEDEIQFLRTEDEWLQCIATI 

HKEQRKFCLAAEGLGNRLCFLEPTSEAKYI 

PPDLCVOSIFVLEQSLSVRALQEMLANTGE 

NGGEGAAQGGGHRTLLYGHAVLLRHSFS 

GMYLTCLTTSRSQTDKLAFDVGLREHATG 

EACWWT1HPASKQRSEGEKVRIGDDLILVS 

VSSERYLHLSVSNGNIQVDASFMQTLWNV

HPTCSGSSIEEGYLLGGHVVRLFHGHDECL 

TEPSTDQNDSQHRRIFYEAGGAGTRARSLW 

RVEPLRISWSGSNTRWGQAFRLRHLTTGHY 

LALTEDQGLILQDRAKSDTKSTAFSFRASK 

ELKEKLDSSHKRDffiGMGVPEIKYGDSVCF 

VQHIASGLWVTYKAQDAKTSRLGPLKRKV 

ILHQEGHMDDGLTLQRCQREESQAARURN 

TTALFSQFVSGNNRTAAPITLPIEEVLQTLQ 

DLIAYFQPPEEEMRHEDKQNKLRSLKNRQ 

NLFKEEGMLALVLNCIDRLNVYNSVAHFA 

GIAREESGMAWKEILNLLYKLLAALIRGNR 

NNCAQFSNNLDWLISKLDRLESSSGILEVL 

HCILTESPEALNL1AEGHIKSIISLLDKHGRN 

HKVLDELCSLCLCNGVAVRANQNLICDNL 

LPRRNLLLQTRLINDVTSIRPNIFLGVAEGS 

AQYKKWYFELIIDQVDPFLTAEPTHLRVG 

WASSSGYAPYPGGGEGWGGNGVGDDLYS 

YGFDGLHLWSGRJPRAVASINQHLLRSDD 

VGKLLPGPRGCPASHSASMGSPCRGCLENF 

NTDGLFFPVMSFSAGVKVRFLMGGRHGEF 

KFLPPSGYAPCYEALLPKEKMRLEPVKEY 

KRDADGIRDLLGTTQFLSQASFIPCPVDTSQ 

VIIPPHLEKIRDRLAEMHELWGMNKJELG 

WTFGKIRDDNKRQHPCLVEFSKLPETEKN 

YNLQMSTETLKTLLTLGCHIAHVNPAAEE

DLKKVKLPKNYMMSNGYKPAPLDLSDVK 

LLPPQEILVDKLAENAHNVWAKDR1KQGW 

TYGIQQDIXNKRNPRLVPYALLDERTKKS 

NRDSLREAVRTFVGYGYNIEPSDQELADSA 

VEKVSIDKIRFFRVERSYPVRSGKWYFEFE 

WTGGDMRVGWARPGCRPDVELGADDQ 

AFVFEGNRGQRWHQGSGYFGRTWQPGDV 

VGCMINLDDASMIFTLNGELLITNKGSELA 

FADYEIENGFVPICCLGLSQIGRMNLGTDA 

STFKFYTMCGLQEGFEPFAVNMNRDVAM 

WFSKRIPTFVNVPKDHPHIEVMRIDGTMD 

SPPCLKVTHKTFGTQNSNADMIYCRLSMP 

VECHSSFSHSPCLDSEAFQKRKQMQEILSH 

TTTQCYYAERIFGGQDPSCVWVGWVTPDY 

HLYSEKFDLNKNCTVTVTLGDERGRVHES 

VKRSNCVMVWGGDIVASSQRSNRSNVDL 

EIGCLVDLAMGMLSFSANGKELGTCYQVE 

PNTKVFPAVFLQPTSTSLFQFELGKLKNAM 

PLSAAIFRSEEENPVPQCPPRLDVQTIQPVL 

WSRMPNSFUCVETERVSERHGWWQCLEP 

LQMMALHIPEENRCVDILELCEQEDLMRF 

HYHTLRLYSAVCALGNSRVAYALCSHVDL 

SQLFYAIDNKYLPGLLRSGFYDLLISIHLAS 

AKERKXMMKMIYIIPITSTTRNICLFPDESK 

RHGLPGVGLRTCLKPGFRFSTPCFWTGED 

HQKQSPEPLESLRTKALSMLTEAVQCSGA 

HIRDPVGGSVEFQFVPVLKLIGTLLVMGVF 

DDDDVRQILLLEDPSVFGEHSAGTEEGAEK 

EEVTQVEEKAVEAGEKAGKEAPVKGLLQT 

RLPESVKLQMCELLSYLCDCELQHRVEAIV 

AFGDIYVSKLQANQKFRYNELMQALNMS 

AALTARKTJCEERSPPQEQINMLLNFQLGEN 

CPCPEEIREELYDFHEDLLLHCGVPLEEEEE 

EEEDTSWTGKLCALVYKIKGPPKPEKEQPT 

EEEERCPTTLKELISQTM1CWAQEDQIQDSE 

LVRMMFNLLRRQYDSIGELLQALRKTYTIS 

HTSVSDTINIIJ^LGQIRSLLSVRMGKEEE 

LmiNGLGDIMNNKVFYQHPNLMRVLGM 

HETVMEVMVNVIXjTEKSQIAFPKMVASCC 

RFLCYFCRISRQNQKAMFEHLSYLLENSSV 

GLASPSMRGSTPLDVAASSVMDNNELALS 

LEEPDLEKWTYLAGCGLQSCPMLLAKGY 

PDVGWNPIEGERYLSFLRFAVFVNSESVEE 

NASVWKLLIRRPECFGPALRGEGGNGLLA 

AMQGAIKISENPALDLPSQGYKREVSTEDD 

EEEEEIVHMGNAIMSFYSALIDLLGRCAPE 

MHLIQTGKGEAIR1RSILRSLVPTEDLVGIISI 

PLKLPSLNKDGSVSEPDMAGNFCPDHKAP 

MVIJFLDRVYGIKDQTFLLHLLEVGFLPDLR 

ASASLDTVSLSTTEAALALNRYICSAVLPL 

LTRCAPLFGGTEHCTSLIDSTLQTIYRLSKG 

RSLTKAQRDTEEECLLAICNHLRPSMLQQL 

I^RLVFDWQLNEYCKMPLKLLTNHYEQC 

WKYYCLPSGWGSYGLAVEEELHLTEKLF 

WGHDSLSHKKYDPDLFRMALPCLSAIAGA 

LPPDYLDSRITATLEKQISVDADGNFDPKPI

NTMNFSLPEKLEYWTKYAEHSHDKWACD

KSQSGWKYGISLDENVKTHPLIRPFKTLTE 

KEKEIYRWPARESLKTMLAVGWTVERTKE 

GEALVQQRENEKLRSVSQANQGNSYSPAP 

LDLSNWLSRELQGMVEWAENYHNIWA 

KKKKLELESKGGGSHPLLVPYDTLTAKEK 

FKDREKAQDLFKFLQVNGIIVSRGMKDME 

LDASSMEKRFGYKFLKKLLKYVDSAQEFIA

HLEAIVSSGKTEKSPRDQEKFFAKVLLPLV 

DQYFTSHCLYFLSSPLKPLSSSGYASHKEK

EMVAGLFCKIAALVRHRISLFGSDSTTMV 

SCIJIIIAQTLDTRTVMKSGSELVKAGLRAF 

FENAAEDLEKTSENLKLGKFTHSRTQDCGV 

SQNINYTTVALLPILTSIFEHVTQHQFGMDL 

LLGDVQISCYHILCSLYSLGTGKNIYVERQ 

RPALGECLASLAAAIPVAFLEPTLNRYNPL 

SVFNTKTPRERSILGMPDTVEDMCPDIPQL 

EGIJNdKJEINDLAESGARYTEMPHVmVILPM 

UmLSWWERGPENLPPSTGPCCTKVTS 

EHI^IJLGNILKIINNNLGIDEASWMKRIAV 

YAQPnSKARPDLLRSHFIPTLEKLKKKAVK 

TVQEEEQLKADGKGDTQEAELLILDEFAV 

LCRDLYAFYPMLIRYVDNNRSNWLKSPDA 

DSDQLFRMVAEVFELWCKSHNFKJREEQNF 

VIQNEINNLAFLTGDSKSKMSKAMQVKSG 

GQDQERKKTKRRGDLYSIQTSLIVAALKK 

MLPIGLNMCTPGDQELISLAKSRYSHRDTD 

EEVREHLRNNLHLQEKSDDPAVKWQLNL 

YKDVLKSEEPFNPEKTVERVQRISAAVFHL 

EQVEQPLRSKKAVWHKLLSKQRKRAWA 

CFRMAPLYNLPRHRSINLFLHGYORFWIET 

EEYSFEEKLVQDLAKSPKVEEEEEEETEKQ 

PDPLHQIILYFSRNALTERSKLEDDPLYTSY 

SSMMAKSCQSGEDEEEDEDKEKTFEEKEM 

EKQKTLYQQARLHERGAAEMVLQMISAS 

KGEMSPMWETLKLGIAILNGGNAGVQQK 

MLDYLKEKKDAGFFQSLSGLMQSCSVLDL 

NAFERQNKAEGIXjMVTEEGTLIVRERGEK 

VLQNDEFTRDLFRFLQLLCEGHNSDFQNFL 

RTQMGNTTTVNVnSTVDYLLRLQESISDFY 

WYYSGKDIIDESGQHNFSKALAVTKQ1FNS 

LTEYIQGPCIGNQQSLAHSRLWDAWGFL 

HWANMQMKLSQDSSQIELLKELLDLLQD 

MVVMLLSLLEGNVVNGTIGKQMVDTLVE 

SSTNVEMILJCFFDMFLKLKDLTSSDTFKEY 

DPDGKGnSKKEFQKAMEGQKQYTQSEIDF 

LI^CAEADENDMFNYVDFVDRFHEPAKDI 

GFNVAVLLTNLSEHMPNDSRLKCLLDPAE 

SVLNYFEPYLGRIEIMGGAKKIERVYFEISE 

SSRTQWEKPQVKESKRQFIFDWNEGGEQ 

EKMELFVNFCEDTIFEMQLASQISESDSAD 

RPEEEEEDEDSSYVLEIAGEEEEDGSLEPAS 

AFAMACASVKRNVTDFLKRATLKNLRKQ 

YRNVKKMTAKELVKVLFSFFWMLFVGLF 

QLLFTILGGIFQILWSTVFGGGLVEGAKNIR 

VTKILGDMPDPTQFGIHDDTMEAERAEVM 

EPGITTELVHFIKGEKGDTDIMSDLFGLHPK 

KEGSLKHGPEVGLGDLSEIIGKDEPPTLEST 

VQKKRKAQAAEMKAANEAEGKVESEKAD 

MEDGEKEDKDKEEEQAEYLWTEVTKKKK 

RRCGQKVEKPEAFTANFFKGLEIYQTKLLH 

YLARNFYNLRFLAIJVAFA1NFILLFYKVTE 

EPLEEETEDVANLWNSFNDEEEEEAMVFF 

VLQESTGYMAPTLRALAQHTIISLVCVVGY

YCLKWLVVFKREKEIARKLEFDGLYITEQ 

PSEDDIKGQWDRLVINTPSFPNNYWDKFV 

KRKVINKYGDLYGAERIAELLGLDKNALD 

FSPVEETKAEAASLVSWLSSIDMKYHIWKL 

GVVFTDNSFLYIAWYTTMSVLGHYNNFFF 

AAHLLDIAMGFKTLRTILSSVTHNGKQLVL 

TVGLLAVVVYLYTVVAFNFFRKFYNKSED 

DDEPDMKCDDMMTCYLFHMYVGVRAGG 

GIGDEIEDPAGDPYEMYRIWDITFFFFVTVI 

LLAIIQGLIIDAFGELRDQQEQVREDMETK 

CFICGIGNDYFDTTPHGFETHTLQEHNLAN 

YDFFLMYLINKDETEHTGQESYVWKMYQE 

RCWDFFPAGDCFRKQYEDQLG 


2661 


C 


54


350 


MLNSSEQRRPHGVLDSVWPGIHGALCAGR 
WIJITGQI^WDTRHMIJUIKMVSSSEPQRP 
PTSWSWCCXASTVRPLLVDGSGWGSCRGR 
PAACWKEDGQFF 


2662 


A 


50 


646 


SSALLSSNQTASFGSCSLSLPCSARERTPEG 
GGWPGGRLSEPLPAMLLLWVSVVAALAL 
AVLAPGAGEQRRRAAKAPNVVLVVSDSFD 
GRLTFHPGSQVVKLPFINFMKTRGTSFLNA
YTNSPICCPSRAAMWSGLFTFILTESWNNF
KGLDPNYTTWMDVMERHGYRTQKFGKL
DYTSGHHSIRHSERGSTNQRSEKV 


2663 


B 


44 


293 


MPVWWRRRRLRARSWALRARPLSLPRAQ 
RSGRLLRRPKGYAPGAPKAHELSPQAICAV 
AFX 


2664 


C 


40 


495 


MVE^ALQRRAFLCAANVKIPRLRIKVKTK 
EASAQVVKEECNKYLLFLLPVPSAGLLPSI
MEIADPFSSFGSEDKCYTLTPPLPRHTEICSS
DSQEKGHFEAGVEPKSRGSTPGQYPGIGCF
ARFREYQIGMRHLTTRPAMHRAQVLFPLS 
F 


2665 


A 


587 


2 


FLTRETGDPTGRSSSHANTQSRFFPDDPPG\ 

PLNNLGNTHGCGRRAGRCPGTGPDGPVAG 

CGGPRCWPSGHLAATGD*GPSCGRLGANR

GEAGPAGFTACSPLSGCRTPYTHHFPASRM 

SCHLNCASPRTYRSQGNRGCERVAQGSQG 

AGGERGAKSQVPVPAPARNKDPAKCRKPR 

NRRPGNSGPWRAYRRQR 


2666 


A 


1 


1853 


RARRLALQCHVCVCALTPGEQSGRRLPGQ 

TWLMFSCFCTSLQDNSFSSTTVTECDEDPV 

SLHEDQTDCSSLRDENNKENYPDAGALVE 

EHAPPSWEPQQQNVEATVLVDSVLRPSMG 

NFKSRKPKSIFKAESGRSHGESQETEHVVS 

SQSECQVRAGTPAHESPQNNAFKCQETWR 

L\QPRIDQRTATSPKDAFETR\QDLNEEEAA 

QVHGVKDPAPASTQSVLA\DGTDSADPSPV 

HKDGQNEADSAPEDLHSVGTSRLLL/YHIT 

DGDNPTAVRHGCSL/FSGQSQRFNLDPESA 

PSPPSTQQFMMPRSSSRCSCGDGKEPQTIT 

QUIXfflQSLKRKIRKFEEKFEQEKKYRPSH 

GDKTSNPEVLKWMNDLAKGRKQLKELKX 

KLSEEQGSAPKGPPRNLLCEQPTVPRESGK 

PEAAGPEPSSSGEETPDAALTCLKERREQL 

PPQEDSKXnTKQDKNUKPLYDRYRIIKQILS 

TPSLIPTIQEEEDSDEDRPQGSQQPSLADPA 

SHLPVGDHLTYSNETEPVRALLPDEKJCEV

KPPALSMSNLHEATMPVLLDHLRETR\DK 

KRLRKALREFEEQFFKQTGRSPQKEDRIPM 

ADEYYEYKHIKAKLRLLAPPAGSYFP 


2667 


C 


147 


398 


MYKAQFLAASPGRCLGLLAASNHHAKSIH 
GFRRLVKTMRNRLCSLCQPFPLPKHLLSLS
WFGDQGHTSQYFTLSTQRNEAQLQ 


2668 


A 


1 


1787 


MSKGESRKCNEENVSKSSKWKVFIVLTPQ 

FLSRDKDQLTKELQKHVKSVTVSCKSPRK 

LLSHITRLHPPSKGQGEKLTHLVDSIKATIW 

CQPVWETVEGQRRRVGNCIDFTNGCDLVG 

SSSLHNMLVCSSYDINRODTFQKDRTSEKH 

LLDSVFTALQDSAGQQWPARLHPQRGEEV 

ADPRGAPSRHVEPENSSPCQGNGEQAGKA 

GARALCXjQARRSPATMPPPLTTRSLCEFAV 

FLLHWLFPELFHYRKLGEQDSCYGDGGKQ 

EIJ)PQRLQnCOTTEVYFPHMQEEEAWRQA 

GPGPAEAAD/TSATSRRSTSPTCRRRRPGCS 

GAPSASTTSFRAWGWTQAAKASPPRDNCY 

NSSSLPDDISIJTHDNLHKQHSCSDSLGKK 

QDDPSCIKIJRH*VHLLYLCTKNNRVWTLE 

FMGNIJHWNRl^GAPTSSSARSTCWPRV*R

HEELCNQS*EVQRGV*GSPAAPERSSKDFC

KIPLDEWVPH*/DFPVRSPYLLSDKEVCKI 

VQQSLSVGNFAAGLL/LPPRTSSCSTTIFGL/ 

DNKKQLDPTQLRL1CH*VEAVYPVEKVEE

VWHCECIPSNDEQCHCTNRKXCNILKKAK 

KVEK 


2669 


A 


14 


425 


RRFREPDAQMLEIPNLTPYTHYRFRMKQV 
NIVGPSPYSPSSRVIQTLQAPPDVAPTSVTV 
RTASETSLRLRWVPLPDSQYNGNPESVGY 
RJKYWRSDLQSSAVAQWSDRLEREFTffiE 
LEEWMEYELQMQAFNAVG 


2670 


B 


1 


825 


MRALKLQRRKSFWIVVAWEAFVQLVNYE 

CKVGEWKGLAHCVSQNNKYRTTYIIAGVP 

NPQEPGYTAGGQLKGNDLTVLHLLV1EGK 

WEAVRKFPFKKYIVNTATVKEARKYWVEE 

GSSLAKATRSNPGYLQPYMRTGIPVFAPPK 

LPFGPPCPLSCTHINPKPQAPEADQQLPIHL 

AESHFHHSIKPRIHPSSPCVTRFFLDAEREL 

GIQKAVPWSFTLVKKQKSLGLPSVQDFGS 

VYKMNIWSDVACCDPQLQQPAASAQTSAI 

SQLSRVTES


2671 


B 


475 


848 


XRTERVHLRITPGDDSRKRSSASHYRVAGI 
SRLTLSLDREQLYLEQSTEGPEQDKREGKS 
ARSSSREPTGQPRTLXGGMRARKRJCTLVL 
GPFPRVISGSNAKMDTLSPACACAFALYGI 
PKPAA 


2672 


A 


3 


765 


LGTVSYGADTMDEIQSHVRDSYSQMQSQA 

GGNNTGSTPLRKAQSSAPKVRKSVSSRIHE 

AVKAIVLCHNVTPVYESRAGVTEETEFAE 

ADQDFSDENRTYQASSPDEVALVQWTESV

GLTLVSRDLTSMQLKTPSGQVLSFCILQLFP 

FTSESKRMGVIVRDESTAEITFYMKGADVA 

MSPIVQYNDWLEEECGNMAREGLRTLW 

AKKA.LTEEQYQDFnVSRLPGIPobY/lJKOLr 

YAFTSSSCLCMKLELGSL 


2673 


A 


9 


413 


EFKSnQHKQSIVELKLQAEDSFVLKVVQL 

EELLQVRHSVFIVGNAGSGKSQVLTLASNE 

RIPLNRTMRLVFEISHLRTATPATVSRAGIL 

Y1NPADLGWNPWSSWIERRKVQSEKANL 

MILFDKYLPTCLDK 


2674 


A 


379 


17 


SWGVWYKYOPLDLVRRYFGEKIGLYFAW 
LGWYTGMLFPAAFIGLFVFLYGVTl'VDHS 
QVSKEVCQATDITMCPVCDKYCPFMRLSDS 
CVYAKVTHLFDNGATVFSAVFMAVWATV 
LMEFGK 


2675 


A 


1 


1833 


MVDSLIARVGVMARGNAITLPVCGRDVKF

TLEVLRGDSVEKTSRVWSGNERDQELLTE

DALDDLIPSFLLTGQQTPAFGRRVSGVIEIA

DGSRRRKAAALTESDYRVLVGELDDEQM 

AALSRLGNDYRPTSAYERGQRYASRLQNE

FAGNISALADAENISHSDKFDANDPILKDQ 

TQEWSGSATFTSDGKIRLFYTDYSGKHYG 

KQSLTTAQVNVSKSDDTLKJNGVEDHKTIF

DGDGKTYQNVQQFIDEGNYTSGDNHTLRD 

PHYVEDKGHKYLVFEANTGTENGYQGEES 

LJFNKAYYGGGTNFFRKESQKXQQSAKKRD 

AELANGALVNTQSTTTRRPGSNSLSHLMW 

PVDHQKFQSVTEMCGSILSRDFADFGTTIK 

QDFRLLGQTSVDRLLQLSQGQAVKGNQLL 

PVSLVKRKTTLAPNTQTASPRALADSLMQ 

IARQVSRLESGQDFADFGTTIKQDFRLLGQ 

TSVDRLLQLSQGQAVKGNQLLPVSLVKRK 

TTLAPNTQTASPRALADSLMQLARQVSRL

ESGQDFADFGTTIKQDFRLLGQTSVDRLLQ 

LSQGQAVKGNQLLPVSLVKRKTTLAPNTQ 

TASPRALADSLMQLARQVSRLESGQ 


2676 


B 


1 


309 


MGKAMLQLLIRAHWTVFPCEHEDNAASV 
SVTLCSDLAGGEWSAVLTGQSWQTEKJEI 
DRSSKPPACLVAPQWFCSEVLRVDESYHR 
KYPVOLRPVHIAAK 


2677 


A 


2 


179 


RGKKSVTTVAGPMAQDVESLALCLQALLS 
EDMYRLDPTVLOMPFREEVKTPFPTPGCSE 


2678 


A 


34 


390 


MKRRRQLRARVFALALAWSLGPCWALRV
AVPKASXTIRGPQRRLLASLLQENTEILGY
LLGSVAAFGSWASRIPPLSRICRGKTFPSIH
LWTRLLSALAGLLYASAIAAHDRHPEYLL
R 


2679 


A 


568 


3 


SYYERINRQLIEAKMALQDREEKMEKVFD 

DIETNIvINLIGATAVEDKLQDQAAETIEALH

AAGLKVWVLTGDKMETAKSTCYACRLFQ

TNTELLELTTKTIEESERKEDRLHELL1EYR 

KXIXHEFPKSTRSFKJKAWTEHQEYGLiroG 

STLSLILNSSQDSSSNNYKSEFLQICMKCTA 

VLCCRMAPL 


2680 


A 


3 


394 


SSRWAFQVLSPSADSARLPGRAPGDRDCTF 

QPSAPAPSKPFLLSTPPFYSACCGGSCRRPA 

SSTAFPREESMLPLLTQDSNSKARRGILRR 

AVFSEDQRKALEKMFHKQKYISKTDRKKL 

AINLGLKESQ 


2681 


A 


42 


406 


EPGDPREGEEEEEEDEPDPEAPENGSLPRFV 
PRFNFSLKDLTRFVDFN1KGROV1VFLHIQK 
TGGTAFGRHLVKNIRLEOPCSCKAGOKKC 
TCHRPGKKETWLFSRFSTGWSCGLHADWT 



2682 



10 



932 



LQLCSMWLLRSWVQAEGAVSISDSPFSLH 
QCWAVIilKAWCVTLQLPGGFTFITWLSD 
NLLGKRVDSAPSWGPLGSAFRGVHMPCV
GAAWEGKGPNLLRPSGKLGPSGSRPTPIGQ 
QQLPEVPRAKGPLGPAAVICQ/HMPAPSTG 
GKRGSFSGRYLSASLELGGLPMAPTGPSAL
SAPPSVSRGAR*STREKPGVYASAT*AAEIR 
EGQALGGVPRPSRNG/SGGPLGPDFGPNGPK 
LRRSKAGCPWWHLSSVDAGE*LWKQHST 
AVFSMPGTQPPWRGLTTMPISPRGTEPTAH 
PGPRSPGLAYSLTA 



2683 



416 



NRLTTHSPHSPGPGGRQAPWRRQCRPASC 
PAKSTTWPVTRAPTRPPAWPPPASAPP/RY 
LLEEWFQNCYARYHQAFADRDQSERQRH 
ESQQLATETQALAQRTQQDSTRTVGERLQ 
DTHSWKSELQREMEALAAETNLLL 



2684 



356 



1356 



TPTTSGRTRKMWPRPGT*PP/ANCSANINLT 
HQPWFQVLEPQFRQFLFYRHCRYFPMLLN 
HPEKCRGDVYLLVVVKSVITQHDRREAIR 
QTWARAAVRGWGPSAVRTLFLLGTASKQ 
EERTHYQQLLAYEDALYGDELQWGFLDTF 
FNLTLKEIHFLKWLDIYCPHVPFIFKGDDD 
WVNPTOLLEFLADRQPQEIS^^ 
ARPIRRKDNKYYTPGALYGKASYPPYAGG 
GGFLMAGSIARRLHHACDTLELYPIDDVF 
LGMCLEVLGVQPTAHEGFKTFGISRNRNSR 
MNKEPCFFRAMLVVHKLLPPELLAMWGL 
VHSNLTCSRKLQVL 



2685 



741 



2686 



396 



687 



VRSMSCPPSWPYCAPCPTNIGESTSPLRKTI 
ETPTLWDPKAPSCSLELPPWVLASPQRSRG 
TALPFLPSNVLPSLALPSTSFLCRPLLSHLV 
TSLLAGPGAHDGHLRKEGWRSTPEMTSLP
APEHPASPCDSVLCSPDVSMCTLGPAARW 
DAQAKSAPLPPCCTDCKSFPHLQRPWAQP 
HTSQATSVDSGEAGTKGMSQFTVWTWWR
SRPCETRQGEGIGNWGYSVTPGPPGSQNLP 
ARLDGQGLAS 



TFCPRCGCPSGLAMRLFLSLPVLVVVLSIV 
LEGPAPA*GAPEVSNPFDGLEELGKTLEDY 
TREFINRrrQSELPAKMWDWFSETFRKVKE 
KLKTDS 



2687 



3794 



PRGPRPGASGSAMWLSPEEVLVANALWVT

ERANPFFVLRRRRGHGRGGGLTGLLVGTL 

DWLDSSARVAPYRILHQTQDSQVYWTVA 

CGSSRKEITKHWEWLENNLLQTLS1FDSEE

DITTFVKGKIHGI1AEENKNLQPQGDEDPG 

KFKEAELKMRKQFGMPEGEKLVNYYSCS 

YWKGRWRQGWLYLTVNHLCFYSFLLGK 

EVSLWQWVDITRLEKNATLLFPESIRVPT 

RDQELFFSMFLNIGETFKXMEQLANLAMR

QLLDSEGFLEDKALPRPIRPHRNISALKRDL 

DARAKNECYRATFRLPRDERLDGHTSCTL 

WTPFNKLHIPGQMFISNNYICFASKEEDAC 

HLIIPLREVTIVEKADSSSVLPSPLSISTKSK 

MTFLFANLKDRJDFLVQRISDFLQKTPSKQP 

GSIGSRKASWDPSTESSPAPQEGSEQPASP 

ASPLSSRQSFCAQEAPTASQGLLKLFQKNS

PMEDLGAKGAKEKMKEESWHIHFFEYGR 

GVCMYRTAKTRALVLKGIPESLRGELWLL 

FSGAWNEMVTHPGYYAELVEKSTGKYSL 

ATEEIERDLHRSMPEHPAFQNELGIAALRR

VLTAYAFRNPTIGYCQAMNIVTSVLLLYGS 

EEEAFWLLVALCERMLPDYYNTRVVGAL 

VDQGIFEELTRDFLPQLSEKMQDLGVISSIS 

LSWFLTLFLSVMPFESAVVTVDCFFYEGDC 

VILQVALAVLDANMEHLLGCSDEGEAMT 

MLGRYLDNVVNKQSVSPPIPHLRALLSSSD 

DPPAEVDIFELLKVSYEKFSSLRAEDIEQMR 

FKQRLKVIQSLEDTAKRSVVRA1PVDIGFSI 

EELEDLYMVFKAKHLASQYWGCSRTMAG 

RRDPSLPYLEQYRCDASQFRELFASLTPWA 

CGSHTPLLAGRMFRLLDENKDSLINFKEFV 

TGMSGMYHGDLTEKLKXTLYKLHLPPALSP 

E\EAE\SALEATHLFSQRDSSSEASPLASDLD 

LFLPWEAQEALPQEEQEGSGSEERGEEKGT 

SSPDYRHYIJmWAKEKEAQKETIKDLPK 

MNQEQFIELCKTLYNMFSEDPMEQDLYHA 

IATVASIXLMGEVOKKxb AK 1 UKJsi'KLJUA 

TEEDEPPAPELHQDAARELQPPAAGDPQA 

KAGGDTHLGKAPQESQVWEGGSGEGQG 

SPSQLLSDDETKDDMSMSSYSWSTGSLQC 

EDLADDTVLVGGEACSPTAR1GGTVDTDW 

aSFEQELASILTESVLVNFFEKRVDIGLKIK 

DQKKVERQFSTASDHEQPGVSG 


2688 


B 


119 


682 


GDKG ADEREISGGTDTAAAAQLK1H Y w Ir 

GPSTVQEHKEVFNTKLADGQNGSPSKQASI 

CDRQFWAGGYHRSLADEAYGDEEDLPK 

WGLVHSTRGPAHPTYLLRPLQKDQDSSL

LRASGGGGGSPSSSTKSEHSCRQIHIPGPFS 

KADITGQKWFPGGVSTl^ARNMGFLKPTP 

TPLLRSPKDFR 


2689 


B 


1 


3097 


MAGARVGPAAGARTAVPAAGEVPASPAL 

TDTQKGTGIGHWVVAVAPTIQTSVWPKPF 

RGNRISVLGFEPHSLVSADPQQSQYPYFLFP 

EPPSPKPLSMLEDSYASLKIQASARAPPLSPI 

DMDKQERIKAERKRLRNR1AASQVPQAQA 

GAHLAPGKKVKTLKSQNTELASTAACCAS 

SSSLVGGSRERVSESGPHICAQRAPPRRAL 

ARGRLMPGDTGPRELHRNPSWWVCLLV

SLLLIGSWMAVRFCHRNESKFEMLDEVS 

MGSVNDRLSFAHHLQEHQFLFPRVAGCRA 

RGTPATPAALGRCWPWPLRPPCPASQRQK 

VAVGPKRMGPSPFRLAATVRQPERPQAPM 

AVPSCPSTPDYENMFVASQQPSTSGMNKG 

KALPAGILQMVTDTSRPNVGGDESLDCLV 

LNRISYTCRSTLSPRPSFSAPGREESGSVMA 

PDDSMGIMRSLGGLSRLTVAA1VRDVTKFC 

DPGPPHPALQETPQMAPSPGAPQPLNPPAP 

PRKRNTASAPVHLRAARDDSEAALYPFLQ 

VSYSLSGHKNNYTYYAWVVGGFRALGYK

HSTDVCSGVTIQEEMWIRHRFLRAAPISQR 

TRHYHRFLGCSMAGSGCASDLLCCDWRD 

SCCRCSJLSAAQATPLSSPRPRPRSAARLSAR 

GAATTAGSVCSGGGEVAGEPGPRRHHVG 

GAEKWGDVQWTPGDCDNWMNINLREVIC 

TSGTGQV1JVDARVIJIPRQHHQYLRJPDEII 

DMVKEEVGPRAAAAEACSSRSSRKPRHGR 

RWPRCFGALSCCGGRESDSTCCKPLPFADP 

QVLHAPEKGVWEAGSRTRPRERAPRSVCP 

GSGPGPGVEATARSCRAGGAEAVEGGTGA 

QASMVNTTGYWARPLQATQGGSAAWQQ 

WGTREASPDDTTTRGLTGAKPESTNSQNH 


2690 


A 


1007 


537 


SRKGSSLAAHPLSPSRLSAVPTAGGGGDSE 

AKPHLVSPGGSEGAIWCGHGQGRGGSGND 

RGGQ\GPGAGGRRGIPTPARGAVIYKTQRR 

EEEGTRGCNQLASLSGPQGATVSPSSGGSS 

PGTCCDRHPIJRADTRMMVWGQEPSPSLVC 

FPKLQPDSL 


2691 


A 


1 


1656 


METEPSKAKANDPGSAAEGVVFASISSGLG 

EVTFLSLTAFYPRAVTSWWSSGTGGAGLLG 

ALSYLGLTQAGLSPQQTLLSMLGJDPALLLA 

SILRKALDK1AEIKSLLEERRIGHKYLGLRY 

OPPLYVLYTDAFWSVTPYSEVHIAFTILEEV 

SLCDSKLIHIIFVRLAYACPRFTVSAWAASI 

PEYMVRISLLTAQVDMTIIG1AFMPCPRPL 

MPTVAPTAAREMGVHHTGDSAGEKLHRA

CCGRGRLCREHRVLALPLSSTLPYRDCAPG 

CILHFPPFVHRYEVDDIDEEGKARHTVSLR 

RnPLTRWKANPETDPEALLVKEKTMFSGC

CNLGDSTANTGSLGNTAKWARVPNYTNM 

QRLVVAPNVGLRCYLLDTRLKGQGKECES 

PPMIGLRSICMHTKKRVSSFRGNKIGLKDVI 

TUUlHVETKVRAKIRKRK\aTKJN^ 

GKRKTARKQLSLSPCSQCLNLVFLLADVW 

FGFLPSIYLVFrTILYEGLLGGAAYVNTFTTM

IALETSDEHREFAMAATCISDTLGISLSGLL 

ALPLHDFLCOLS 


2692 


B 


1 


678 


MKTUJUIASRFLALPRTSFNALSKSHNLLG 
FKDTRSNVEALAQKTQPSVFPKESVQVTPV 
CYTKGDRESVQKCPLIFRSHSATEQVSIRR 
GVIVRVAKWRGESH1HGGPDVPGLVLDTS 
YETSPLfflTPALRVYYIGEDIAMEQVTNLA 
FPLLYSNSHRVSEPGELGFWGPGESVMPA 
DAVSVPCTCHPGSYGVQPLVLRIQGYSGT 
GRWISASAMSCHSDRNG 


2693 


A 


22 


334 


ALKHFCLCSLIFSVTTMKFLAVLVLLGVSIF 
LVSAQNPNTTAAPADTVSSLLVLLMMKPLD 
AETTAAATTATTAAPTTATTAASTTARKDI 
PVLPKWVGDLPNR 


2694 


A 


3 


435 


R\0)PRVRAPRCGDKIKNHMY\KCDCGSLK 

DCASDRCCETSCTLSLGSVCNTGLCCHKC 

KYAAPGWCRDLGGICDLPEYCDGKKEEC 

PNDIYIQDGTPCSAVSVC1RGNCSDRDMQC 

QALFGYQVKDGSPACYRKLNRIGNRFGT 


2695 


A 


120 


1438 


TMNSEDTLRQNLLMGYRQHQAILTAHSTG 

PRRPAHQSSAEGSLVPCSGN\PVPPKG*LW 

ARQGPAEVSGAGKIPASPKTGFPFLFLSSH 

WKLEKGYSPCAQAGCSKGQGLSPQPYLKV 

LIILGYQA*KGS*FFGPSPPSRKVFPSMGTG 

PQRRKFS*PRFPEGLN*PDCGPGTEPPLGCG 

CRGLS*VPRSGREKRAMADP*SQLGGSQL 

GGDFS/*GPEAGRL*VGAQQGPPGVRNRH*

SPLLTSS*R/PKARSPDESRGKPQSPLPMMS

LLP/RGGPSGPHLGPPLEHLPPAPSTPLQNP 

GPQSMV\GPHSDFYPLPVSPWGSRRLQPTQ 

LCXPDSKLPGASPPGSAKMAAGQVRWNG

WAR/PTPPGN*PPSSPPGADPLLSQLDPLRP 

LKWLPSLQFFPKGCGLGCLCPGPPASERSV 

LSPAPG\PGLVGVLGEQGVARTPGGR 


2696 


A 


2 


454 


SGHGSSSGTKSSKKKNQNIGYKLGHRRAL 
FEKRKRLSDYALIFGMFGIWMV1ETELSW 
GAYDKASLYSLALKCLISLST11LLGLIIVYH 
AREIQLFMVDNGADDWR1AMTYERIFFICL 
EILVCAIHPIPGNYTFTWTARLAFSYAPST 


2697 


A 


506 


1317 


GRTSSGKAGMWKPGAESWPLHTGAAQY 

MWFEKLYAGLQCVEKYLIYPAWLNALT 

VDAHTWSHPDKYCFYCRALLMTVAGLK 

LLRSAFCCPPQQYLTLAFTVLLFHFDYPRL 

SQGFLLDYFLMSLLCSKLWDLLYKLRFVL 

TYIAPWQITWGSAFHAFAQPFAVPHSAML 

FVQALLSGLFSTPLNPLLGSAVFIMSYARPL 

KFWERDYNTKRVDHSNTRLVTQLDRNPG 

ADDNNLNSEFYEHLTRSLQHTLCGDLVLGR 

WGNYGPGDCF 


2698 


A 


86 


820 


MACYLLVAN1LLVNLLIAVFNNTFFEVKSIS 

NQVWKFQRYQLIMTFHERPVLPPPLUFSH 

MTMIFQHLCCRWRKHESDPDERDYGLKLF 

ITDDELKKVHDFEEQCIEEYFREKDDRFNS 

SNDERIRVTSERVENMSMRLEEVNEREHS 

MKASLQTVDIRLAQLEDLIGRMATALERL 

TGLERAESNKIRSRTSSDCTDARLHWPVRA 

ALTSQEREHLSAPKRGLEPWQNILFIQYKP 
AASSST* 



2699 A 



2700 B 



2701 A 



2702 A 



2703 A 



553 



KASVIVIISDVKPFKCKLCGKEFNKMHNLM 
GHMHLHSYSKPFKCLYCPSKFTLKGNLTR 
HMKVKHGVMERGLHSQGLGRGRIALAQT 
DGVLRSLEQEEPFDLSQKRRAKVPVFQSD 
GESAQGSHCHEEEEEDNCYEVEPYSPGLAP 
QSQQLCTPEDLSTKSEHAPEVLEEACKEEK 

EDASKGEW 



123 



719 



MTEEEEWKPMDPSKMRCSFFQNGKESEKE 
KVPTRSLLAQV1IPLVNYRGDGSDATLQNA
DPFVGKAGLGFVDDSPLKEVRCQRGLMD 
KVHKSVCEKTKKGEAVPALCILILDNPSSC 
YQPFLAYPRYVKPSSEPSILPWKENIELGK 
QATNNSFTEYMLNCAGLDPCHSMCGSRTK 
TTTITCELARNAESQAPPHTY 



185 
718



284 



GQARWLMSVIPALWKAEAGGPLEPRSSRP 



AWAT 



305 



SEQEPLLGDTPGSREWDILETEEHYKSRWR 
SIRILYLTMFLSSVGFSVVMMSIWPYLQKID 
PTADTSFLGWVIASYSLGQMVASPIFGLWS 
NYRPRKEPLIVSILISVAANCLYAYLHIPAS 
HNKYYMLVARGLLGIG 



502 



822 



DSKAAQDLEKLHGVNGMSVDEKPDSPVMY 
VreSTVHCTNILLGLNDQRKKDILCDVTLI 
VERKEFRAHRAVLAACSEYFWQALVGQT 
KNDLWSLPEEVQ*FGLCDC 



2704 A 



2705 C 



2706 A 



313 



638 



2707 A 



431 



244 



1606 



RWRQRWFWCLHCLVLFRITPRTFALSQCR 
PWDDSRSQDTSMSHSIQWNRMYCNCSMQ 
DEQEADEANGKGPAQVGDRQAWAGR/CR 
SHRREGTIPGNPHPRAS*RAGWQR 



838 



MLLHVGTTAHVAVEHLIGGVQDDEDLEM 
TIGCHGEEMIGDLDKNSFGAGGLCIGERVG 
GPGCCEVLIRMTPTEDVGEERSDMKGIQLS 
MQERTRCRQFPEGRRHQLGHLLQGGLGRG 
EAWKYHQIWEEGHWLLREQ 



375 



228 



RGMGRTYRGRHTDSRKSDR**GGRRQKTQ

KPMSCITVORKHGTS 

GTSGVQQEISRLTNENLDLKELVEKLEKNE 

RKIJKKQOCIYMKKAQDLEAAQALAQSER 

KRHELNRQVTVQRKEKDFQGMLEYHKED 

EALLIRNLVTDIJCPQMLSGTVPCLPAYILY 

MCIRHA\DYTNDDLKVHSLLTSTINGIKKV 

LKKHNDDFEMTSFWLSNTCVRLLHCLKQY 

SGDEGFMTQNTAKQNVEHCLKNFDLTEYR 

QV\L\SDLSIQIYQQLIKJAEGVLQPMIVSAM 

LEN*SIQGLSGVKPTGSQKHSSSMADEDNS

YRLEAHRQMNAFHTVMCDQGLDPE1ILQV 

FKQLFYMINAVTLNDLLLRKDVCSWSTGM 

QLRYNISQLEEWLRGRNLHQSGAVQTMEP 

L1QAAQLLQLKKKTQEDAEAICSLCTSLST 

QQIVKILNLYTPLNEFEERVTVAFIRTIQAQ 
LQERNDPQQLLLDAKHMFPVLFPFNPSSLT 
MDSIHIPACLNLEFLNEV 


2708 


B 


1 


468 


MQGLVNYQISDCCSNQFKLEVCLLNAENK 
VVDNQAGTQGQLKVLGANLWWPYLMHE 
HPAYLYSWEVRLTAQKSLGPLTSTHSLWG 
SALCPSPRASGMVTAHTKALDPSQPVTFVT 
NVTYAADKGPLWEVAAPSSSQRASSGVTE 
LTRVTPVDLQIE 


2709 


A 


419 


2 


TSNPKNKVGLLDLELNRLTKALFMALVAH 

SIVMVTIQGFVGPWYRNLFRFLPLFSYIITIS 

IJR.VNLDMGKAWGWMMMKDEN1PGTVV 

RTSTIPEELGRWYLLTDKTGPLTQNEMTF 

KRLHLGTVSYGADTMYEIHTK 


2710 


A 


1 


570 


MSAACGQNYTIALMEMGSVFAFGENKMG 
QLGLGNLTDTIPSPAQIIYNGQPITKMAFGA 
EFSMLMDCKX3NLYSFGCHEYGQLGHNSDG 
KFIARARRTDGYGRLGHAEQDEMWHLVK 
LFDFPGHRVSQIYTGYTCSFAISEVGGLFFQ 
GATNTSRESTTYPKAVQDLCGWIIQSLACG 
KSSIIVATERAP 


2711 


A 


574 


737 


AWEGAHVFTTSPSSCHSWVRDYARVGLPP 
LPLPCPQRALLGLWEVWKGAYSPAI 


2712 


A 


175 


2 


MALRHLALLAGLLVGVASKSMENTDTDV 
PAPEVLTRSTAGVRGACASQRGALRCLLG 
P 


2713 


B 


85 


591 


MERGPVTCTQAQTVRGRTGHRRRFGPGA 

HGLREEPEFVTARAGESWLRCDVIHPVTG 

QPPPYVVEWFKFGVPrPIFIKFGYYPPHVDP 

EYAEQSCFQAPSFPSPSPAEELRVVSARHG 

LCQALDASWFCTGVQRQPWTQPPTGYHL 

AQRAGDLYPVGFPKETYFEKV 


2714 


A 


1196 


1459 


KQCQRRCLETEVWKLSKLQISTKASNRQD 
RSTFSAPPRKSQL3V1W*TSLLSYFQKLPQSP 
QPSATTALISQQPSTLNPQPWPGSCPGG 


2715 


B 


1 


888 


MRIRRWSLMFDSVWPMCAFYSWAKASRT 

FLKADGLPRRKQWVLVEALAGGGVLGVK 

QITIQVLJEVLLRRGKESETYTKMYRRLGP 

EROOtSKYAGVERIVDKRKNKKGKWEYLI 

RWKGYGSTEDTWEPEHHLLHCEEFIDEFN 

GLHMSKDKRIKSGKQSSTSKLLRDSRGPSV 

EKLSHRPSDPGKSKGTSHKRKRINPPLAKP 

KKGYSGKPSSGGDRATKTVSYRTTPSGLQI 

MPLKKSQNGMENGDAGSEKDERHFGNGS 

HQPGLDLNDHVGEQDMGECDVNHATLAE 

NGLGL 


2716 


A 


94 


3006 


RTRSLTRKAMAEHAPRRCCLGWDFSTQQV 
KWAVDAELNVFYEESVHFDRDLPEFGHV 
U)VHGVHVHKDGLTVTSPVLMWVQALDn 
LEKMKASGFEFSQVLALSGAGQQHGSIYW 
KAGAQQALTSLSPDLRLHQQLQDCFSISDC 
PVWMDSSTTAQCRQLEAAVGGAQALSCL 
TGSRAYEFNLVCDRKHLKDTTQSVFMAGL 

LVGTLMFGPLCDRIGRKATILAQLLLFTLIG 

LATAFVPSFELYMALRFA\GLLPSLDLASA 

MSPY*QNGWGPHGGRRPWSWPSATSPSGR 

WCLRDSPTVSATGGSFRSPALRLAYCSSVLL 

LGSARICTLAPDPWEDGRGDTTDPENGLG 

Q*AETLPGAHEPAGPREDRPLRECPGSVQT 

PPAPEGDPDYLLCLVCGQSGVLRPEPPSGG 

LRPGRLSDAAHLWSC *GACPLFQHLHD AE 

VWPQVEP/RWGPWSWVA*CVSSSSSSQQIC 

PWWSPCWLWWGKWPQLLPLPSPMCTLPS 

FSPPSSGRQAWGWWASSHGSGASSHHL*S 

CWESTTLPSPCSSTAASPSWPA/SLCTLLPE 

THGQGLKDTLQDLELGPHPRSPKSVPSEKE 

TEAKGRTSSPGVAFVSLGTSDTLFLWLQEP 

MPALEGHIFCNPVDSQHYMALLCFKNGSL 

MREKIRNESVSRSWSDFSKALQSTEMGNG 

GNLGFYFDVMEITPEnGRHRFNTENHKYF 

KGKGAPGHPMPSLKANFDLLACLRGVGSS 

TLLLWPAVLGAQTRQAGVNEGRSQVADF 

LRIPVTGCPEQRRNPPSPPAPLGTGGPAEER 

LQFPGVAGSRRGRGRILRAGGIGRASPGEG 

TGAPRPRAGQGRGGPGKPESGGGGPVALR 

PGDCTCCVLKSQPRQQRRGACSAMAFRVR 

LRVRQSVRPPRGVIVAALQRPETQGPAPSS 

ARPDCGPESRGGLALWRRLRGYASRDRVL 

CNRRCPHAARFP SKRTPSGSPHLHLMSS W 

AVP 


2717 


A 


1308 


369 


LRSNHGEDWSQFIGAAQRETTVSLLPMPH 

TWPVSLSTGSCM/TRGTPEPFINNPQLQVH 

FHR/EDDEHSDIAFHF*VYFGHWVIMNSHE 

aGAWKCEERSNNMPAEDGRVFELHIIVLD 

NE YQ AMVNG/QS LLHSFAHRLLPGS VKMV 

QVWRDVSLNSRCVSSGETVSSSSSFLPPPPP 

PLPLPLLLLLPPLPLPDEALFLSLPSHALPSG 

RCGVLSLCGSHYPQPGGLLQSSAGASGRR 

GAPGVPWQVLVLLTPRGLQGPPPGMRGRV 

VHKPLLVMELGEQPFSFPSVRTATSSASGK 

APPRCPWPGPRALSPSSVP 


2718 


A 


2 


1226 


SLGSTISTDWANHYLAKSGHKRLIRDLQQ 

D VTDG VLLAQ HQ W ANEKIEDINGCPKNR 

SQMIENIDACLNFLAAKG1N1QGLSAEEIKN 

GNLKAILGLFFSLSRYKQQQQQPQKQHLSS 

PLPPAVSQVAGAPSQCQAGTPQQAPGVPV 

TPQAPCQPHQPAPHQQSKAQAEMQSRLPG 

PTARVSAAGSEAKTRGGSTTANNRRSQSF 

NNYDKSKPVTSPPPPPSSHEKEPLASSASSH 

PGMSDNAPASLESGSSSTPTNCSTYSGIPHS 

GAATK^WRSKSI^VKHSAWSMLSVKPPG 

PEAPRPTPEAMKPAPNNQKSMLEKLKLFN 

SKGGSKAGEGPGSRDTSCERLETLPSFEESE 
ELEAASRMLTTVGPASSSPKIALKGIAQRTF 
SRALTNKKSSLKGNEKGKE 


2719 


A 


103 


742 


NANTQRARRREGARLDNLWLEQVISVLPG 

L VTQGFRCHS GPMGRGLEPHPIRGAG AGS 

CQLSIRGRGGRIPAFLTPRRLAPKGGRDLG 

FPAPRGTRCLRHSFCRSIARTVT/RTVRGIR 

GEEARTPGSREMDSWFEDVDVNFTQEEW 

ALLDPSQKNLYRDVMQFTFRNLASVGKK 

WKDQOEDEYKNPRRNLRNYVYHFSLKK 

WSWSLYARQT 


2720 


A 


1258 


586 


LLLHSLFPVPRMGNSASNIVSPQEALPGRK 

EQTPVAAKHHVNGNRTVEPFPEGTQMAVF 

GMGCFWGAERKFWVLKGVYSTQVGFAG 

GYTSNPTYKEVCSEKTGHAEWRWYQPE 

HMSFEELLKVFWENHDPTQGMRQGNDHG 

TQYRSAIYPTSAKQMEAALSSKENYQKVL 

SEHGFGPITTDIREGQTFYYAEDYHQQYLS 

KNPNGYCGLGGTGVSCPyGIKK 


2721 


A 


2806 


382 


NEIEKQLNAIRDNIKIGEDRAARLDRKMEE 

QQVRLNEAEQKYKDIQDKLEKISEETNAR 

APECMAIJCADWAKKRAYNEAEVLYNRS 

DSTEYKALKKDDEQLCKRIEELKKSTDQSLE 

PERLERQKKISWLKERVKAFQNQENSVNQ 

EIEQFQQAIEKDKEEHGKIKREELDVKHAL 

SYNQGQLKELKDSKTDRLKRFGPNVPALL 

EAEDDAYRQGHFTYKPVGPLGACIHLRDPE 

LALABBSCLKGLLQAYCCHNHADERVLQA 

LMKRFYLPWTSRPPnVSECRNEIYDVRHR 

AAYHPDFPTVLTALEIDNAVAANSLIDMR 

GIETVLUKNNSVARAVMQSQKPPKNCRE 

AFTADGDQVFAGRYYSSENTRPKFLSRDV 

DSEISDLENEVENKTAQILNLQQHLSALEK 

DIKHNEEIJJCRCQLHYKELKMKIRKNISEI 

RELENIEEHQSVDIATLEDEAQENKSKMK 

MVEEHMEQQKENMEHLKSLKIEAENKYD 

AIKFKINQLSELADPLKDELNLADSEVDNQ 

KRGKRHYEEKQKEHLDTLNKKKRELDMK 

EKELEEKMSQARQICPERIEVEKSASILDKE 

INRLRQKIQAEHASHGDREEIMRQYQEARE 

TYLDLDSKVRTLKKFIKLLGEIMEHRFKTY 

QQFRRCLTLRCKLYFDNLLSQRAYCGKMN 

FDHKNETI^ISVQPGEGNKAAFNDMRALS 

GGERSFSTVCFILSLWSIAESPFRCLDEFDV 

YMDMVNRRIAMDLILKMADSQRFRQFILL 

TPQSMSSLPSSKLIRILRMSDPERGQTTLPF 

RPVTQEEDDDQR 


2722 


A 


1567 


1145 


AEVLGRAVEPPPGRCWSTPPVAPPARSASA 
AAMGVQVETISPGDGRTFPKRGQTCWHY 
TGMLEDGKKFDSSRDRNKPFKFMLGKQEV 
IRGWEEGVAQMSVGQRAKLTISPDYAYGA 
TGHPGUPPHATLVFDVELLKLE 


2723 


A 


374 


656 


RRVGCRCFHPSQTGTCT*RPPWNVHH*PAT 
CHLAYNRHSWSPHRA/HWH1ATA1QLSAH 
W/ACHYQQLHHYHQHHHHHHHYRHHHH 
HHHHHYCHHH 


2724 


A 


1171 


1639 


PMALWADGRARHKVGTECECGMHPGLKC 

SGRTLGSQTMLATTPCDSPT*I/SNKKGLRS 

V/SYR*CLINALWLFSISPHILVRCGTESS*L 

LPSLVPSWLP*LVRVR\PLPTGWC*IPSCLKP 

\PPTWSSHHSPQRLP*NPATLVCLQNGTARS 

HSSTPV 


2725 


A 


8 


505 


GSFKTGLYLPTSDIDLWFGKWENLPLWTL 

EEALRKHKVADEDSVKVLDKATVPIIKLTD 

SFTEVKVDISFNVQNGVRAADLIKDFTKKY 

PVLPYLVLVLKQFLLQRDLNEVFTGGIGSY 

SLFLMAVSFLQLHPREDAC1PNTNYGVLL1 

EFFELYGRHFNYLKTG 


2726 


A 


214 


32 


MTLRMLVPRLLLTRQLVWFFSAATERDPE 
MMNGIPRKJLMSFPPSSVTSRRSRRGHHLQS 
L* 


2727 


A 


2 


40 


WNSDQPATR*QVGDTGSLPSRKGQHFVLT 

GIDTYSRSGFAFPVRHAPAKTSERGLTECRT 

YCHGMPHCTAS V*GTPFTAKKVW *RAH A 

HGIPRYDHVAHHLEAAGLIRWWNGLLKTP 

LQHQLGGDALQGWARVLQEAVYALNQN* 

V*GW 


2728 


A 


16 


444 


TPSPSPCPXPRPLAALKPVRLHSFQEHVFKR 
ASPCELCHQLIVGNSKQGLRCKMCKVSVH 
LWCSEEISHQQCPGKTSTSFRRNFSSPLLVH 
EPPPVCATSKESPPTGDSGKVDPVYETLRY 
GTSLALMNRSSFSSTSESPTRS 


2729 


A 


37 


655 


AEPAAGAGTLAGDCRAVQGGVHAARPRG 

AKEGHGPADGHGKGGAGTGQERLAGGAE 

VCHAQVRGGAAAPGCRVGGVLRAAKAE* 

GAGRARGRAG1AGGHPAGGHPHQPGQGA 

G*AEDQGQRAPGRGEAAGSGR/GA/GPGA 

GAAGAAAGEGEDQRHRPACQAPRRGGGE 

HEQGGLREVRGGGAGIARGPAGAGRAAG 

PVAGGAATAGAA 


2730 


C 


257 


498 


MQfCSEGSGGTQLKNRATGNYDQRTSSSTQ 

LKHRNAVQGSKSSLSTSSPESARKLHPRPS 

DKLNPXTINPVHSDDEVFERG 


2731 


A 


342 


665 


MALDFVNVLLCQLAEVTLGVLREEGASLL 
VALGSALFPSAAAVGKQGSMGVTSHMQC 
PVCQHPRDVLLASPVSHSHACXJPQPAGCS 
N Orli-fVjrlL 1 Korrr vvxL..L«rl^lA<( 


2732 


A 


1 


825 


MKRYSYGSVLFTAFDLGYLDPDEVQQGHE 

IGRLFDGTEPIVLDSLKQHYFIDRDGQMFR 

YELNFLRTSKLLIPDDFKRTLVF1LPLAAPFS 

VGLEACPLAGKRLKGSVCPELEFPLWKKH 

RVFSQSLPYKTHAFNEERLQDNKSYIHSVL 

OEPREDTDPEGAGAAPDHRSTYKLLSPALS 
LNIX3EKNKWLRRYIELLISEREMAAAGSSI 

mtm rrrT TIT/OAT /*\Y T k WT?C\ 7 A ' 1 "1 \ /T 

PSWTSVSIQVKLRKCQLQLLAKEEVA1 1 VL 
DETSGVNGIHIEHQLQCLIQVPKLSAPNIAP 

PTPA 


2733 


A 


135 


438 


GMGYLPIAKGIIJIKDLKSKNVFYDNGKVV 
ITDFGLFSISGVLQAGRREDKLRIQNGWLC 
HLAPEIIRQLSPDTEEDKLPFSKHSDVFALG 
TIWYELHAREWP 


2734 


A 


74 


661 


HTHKLVAPRPGLPPTSQWPRDAGRQASGG 

LPSLSTGPPKGPRDGLARGHPAEWLAGSPG 

NNSPTQGSLPPQLDLYAGALFVHICLGWNF 

YLSTILTLGITALYTIAGMVPAAGRSTQGT 

CKGVRRPPPPTGPREQPRKWPQQEPQKFLP 

VSLIJGARAPSSNLASTGRGPGCCNLHGRP 

ADAHHGGGGCHPDNQR 


2735 


A 


40 


446 


RHLLLSLSAVTGKCSFAPDCGELKLPGAAC 

ACQWADVSSLLL*LCQMRELRCENVATC 

LGIFVGSLGNLLRKEVLHLDWTFKASLLLD 

LICMRSLPGPGTAELLWTAPELLPGPGRPG 

RRTLTGDBFSTGIILQE 


2736 


A 


1 


517 


LVDPRVRGEPGPPSDAVFARDPMRPPGLV 

RNLQVTDRSNTSITLSWAGPDTQEGDEAQ 

GYVVELCSSNSUJWLPCHVGTWVTTYTA 

KGLRPGEGYFVRVTAVNEGGQSQPSALDT 

LVQAMPVTVCPKFLVDSSTKDLLTVKVGD 

TVRVPVSFEHARRPLGPSTCRRTCLGR 


2737 


A 


3 


437 


NDPRVQKPREEAPAGAAASG*CGR*PGQH 

PAAA*\P*SAGPRRAPTALSPPTAEPSLCPA\ 

PG*PEQPQCSRRPGGQPRDPVGQHRSQPAV 

GPAAGSPLRPCAWSAQRGSPQPDQLPHTPP 

GAAGS*SQLPRPPPSFAQATPSTPP 


2738 


A 


34 


576 


EELCVREHVTGGICGGSQMMVVLLGATTL 

VLVAVAPWVLSAAAGERRGGESWRRAGG 

RARSWATGAAMLLGATDAQSGKPSVHFA 

APFCIKPDLGSQINQEKWFWVLSCRLPVAV 

YGSSGAPGSHPREMAVPELCVEFDSFRETH 

QILLVYFVCGPRQLFFQCGPRKPKRVDTLD 

ADEACR 


2739 


A 


2 


410 


CHSTESSSDFILPGDYLLGGLCPLHSGCLQV 

\CSFNEHGYHLFQAMRLAVEEINNSTALLP 

NITLGYQLYDVCSDSANVYATLRVLSLPG 

QHHIELQGDLLHYSPTVLAVIGPDSTNRAA 

TTAALLSPFLVPMLLEQ 


2740 


A 


2 


417 


STRPEFPGRAPTGFLKLLADKNSELFRKYA 
T F ^P^nHR VPRTY VPLKD CPODF V ARPKD Y 
ANTLFICRIVDWKEDCNFALGQLAKSLGQ 
AGEIEPETEGILTEYGVDFSDFSSEVLECLP 
QGLPWT1PPEEFSKRRVV 


2741 


A 


1 


312 


MAPAADREGYWGPTTSTLDWCEENYSVT 
WYIAEF\SWLMSGFLPTPSSLRDLTASRWV 
RSLPPSRSPAGROPGPAEELPKASPCPWGK 
SLSRPFASFSASSGPS 


2742 


A 


2 


374 


FRDLQCALYNGRPVLGTQKTYQWVPFHG 

APNQCDLNCLAEGHAFYHSFGRVLDGTAC 

SPGAQGVCVAGRCLSAGCDGLLGSGALED 

RCGRCGGANDSCLFVQRVFRDAGAFAGY 

WNVTLIPEGA 


2743 


B 


218 


656 


MGPVPLVWAMSQLSLSAKMDRRRTGVM 
MTSTPITWGTLEKTMQEAEKLLERQGQTK 
TPDSMFIAMEESniSnrFVKNITTQFMVCG 
FNPYWIAAKADQLQVWSHTTTASQER 


2744 


A 


85 


396 


MILINFREICLKVLHTPLCVSGGCVLLYILA 
LTCCYTNSLLISHLPPLSIJPTETQTHLFMYR 
VLKVRKDIKNHVFHPTYLVAKETETYGEE 
LEPLPPCREHQD* 


2745 


A 


1 


3899 


NRPSSASSTSSKAPPSSRRNVGMGTTRRLG 

S STLGSKSS AAKEGAGAVDEEDFIKAFDD V 

PWQIYSSRDLEESINKIREILSDDKHDWEQ 

RVNALKKIRSLLLAGAAEYDNFFQHLRLL 

DGAFKLSAKDLRSQWREA\CITLGHLSSV 

LGNKFDHGAEAIMPT1FNL1PNSVAKIMATS 

GWAVRLIIRHTHIPRIJPVITSNCTSKAVA 

VRRRCFEFLDLLLQEWQTHSLERHISVLAE 

TIKKGIHDADSEARIEARKCYWGFHSHFSR 

EAEHLYHTLESSYQKALQSHLKNSDSIVSL 

PQSDRSSSSSQESLNRPLSAKRSPTGSTTSR 

ASTVSTKSVSTTGSLQRSRSDIDVNAAASA 

KSKVSSSSGTTPFSSAAALPPGSYASLDGTT 

TKAEGRIRTRRQSSG SATNVASTPDNRGRS 

RAKVVSQSQRSRSANPAGAGSRSSSPGKLL 

GSGYGGLTGGSSRGPPVTP S SEKRSKIPRSQ 

GCSRETSPNRIGLARSSRIPRPSMSQGCSRD 

TSRESSRDTSPARGFPPLDRFGLGQPGR1PG 

SVNAMRVLSTSTDLEAAVADALKKPVRRR 

YEPYGMYSDDDANSDASSVCSERSYGSRN 

GGIPHYLRQTEDVAEVLNHCASSNWSERK 

EGLLGLQNLLKSQRTLSRVELKRLCEIFTR 

MFADPHSBCRVFSMFLETLVDFmHKDDLQ 

DWLFVLLTQVLLKKNGEADLLGSVQAKVQ 

KALDVTRDSFPFDQQFNILMRFIVDQTQTP 

NLKVKVAILKYIESLARQMDPTDFVNSSET 

RJLAVSRI1TWTTEPKSSDVRKAAQIVXISLF 

ELNTPEFTMLLGALPKTFQDGATKLLHNH 

LKNS SNTS VGSP SNTIGRTP SRHTS SRTSPL 

TSPTNCSHGGLSPSRLWGWSADGLAKHPP 

PFSQPNSIPTAPSHKALRRSYSPSMLDYDTE 

NLNSEEnfSSLRGVTEAIEKFSFRSQEDLNE 

PDCRDGKKECDIVSRJDGGAASPATEGRGGS 

EVEGGRTALDNKTSLLNTQPPRAFPGPRAR 

DYNPYPYSDAINTYDK.TALKEAVFDDDME 

QLRD VPIDHSD L V AD LLKELSNHNER VEE R 

KGALLELLKTTREDSLGVWEEHFKTILLLL 
LETLGDKDHSIRALALRVLREILRNQPARF 

KNYAELTIMKTLEAHKDSHKEVVRAAEEA 

ASTLASSIHPEQCIKVLCPUQTADYPINLAA 

KMQTKVVERIAKESLLQLLVDIIPGLLQGY 

DNTESSVRKAS VFCLV AIY SV1GEDLKPHL 

AQLTGSKMKLLNLYKRAQTTNSNSSSSSD 

VSTHS 


2746 


A 


153 


1224 


RVFSESVCSPVRNLEFLWRFAFPLAPAGRC 

PPGVPLQTSPRDTDAHRSSPLPPARASPGQ 

VAAAYRWARCPGCGGRKPRSSGSWQLCR 

CPTLPPPPRGSRSSGRC/RTWPSPPSCFPHFQ 

SGPRTTRAPTPSTT\PGYSGSYSSGPGR*GLS 

PLHAA/VSPPLPPGGP*GSWARAGLGSIASA 

HSPCPLCRSLIRSRS*QTCTRSPT*NCEVPPS 

AP*AASPLRTMFALVRTAGLKVHLLPLGY 

CITMS*SSSMPQTVPVVVKVSNIPSVHPP*P 

CCKDCTISRSRSIFTRSPICNPPGFLLPFCSPS 

TGQ*SL*KEPPLASWTHFRSDVLLLFSVSM 

NGSTLSLGCPSQKAVTALVQVT 


2747 


A 


1 


996 


MKIHSCAFVIEQEEKKKTEAHKEGDGVKR 

ADKILGVTKDPGTIAGLNVVRIINEPTAAS1 

AYGTDKKFGAERHVLIYDLRDEIFDVSVLT 

LEDEIFEIKSTAGDTHLGEEDFDNQMINHFI 

AEFKYKHKDSRADIYTSrTHAQEEELNAVL 

FRGTQDPIEIALQDTKLDKLQnrVWLTQTF 

TTYPDNQPDVUQVYEGESAITKDNNLLVI 

QGKFELTGILPAPFAVPQIKVTCDIDVNSSL 

NISAVGKSTEKENKIHTNDQGHLSKEDIEN 

MVQEAEYKAEDEKQKNKVASKNSLDSYA 

FNMKATEKLQGKINNKDKOKILDKasraiN 


2748 


A 


73 


1210 


IPPPSSPSSPAAAPRAQLGKDALSPLALLLR 

PRRAYPRPLPTSESLAWGSPPPSRFGPSPAS 

QPRSPRLSFLVLGVACSA1LMYIFCTDCWLI 

AVLYFTWLWDWmTKXGGRRSQWVRN 

WAVWRYFRDYFPIQLVKTHNLLTTRNYIF 

GYHPHGIMGLGAFCNFSTEATEVSKKFPGI 

RPYLATLAGNFRMPVLREYLMSGGICPVS 

RDTIDYLLSKNGSGNAniWGGAAESLSSM 

PGKNAVTLRNRKGFVKLALRHGADLVPIY 

SFGENEVYKQVIFEEGSWGRWVQKKFQK 

YIGFAPCIFHGRGLFS SDT WGLVT YSKPITT 

WGEPITTPKLEHPTQQDIDLYHTMYMEAL 

VKLFDKHKTKFGLPETEVLEVN 


2749 


A 


351 


205 


DLYSEKASADHEGAEQFTDEFAKVIADGN 
LMrEi^V YIMAVKlMJr WL/ivivr j 


2750 


A 


172 


2 


MLEQASLWLGRSFLLAGFLVSSSCPSLEQA 
AKGEGCSPIPCFAHCLDSLVRNFLCHP 


2751 


A 


2 


1410 


GPLIDLCKGPHETHTGKKTIQU'TNSSTYW 
EGNPEMETLQRJYGISFPDNKJVDvIRDWEKF 
QEEAKNRDHRKIGKEQELFFFHDLSPGSCF 
FLPRGAFIYNTLTDFIREEYHKRDFTEVLSP 
NMYNSKLWEASGHWQHYSENMFITEIEK 

DTFAIiCPMNCPGHCLMFAHRPRSWREMPI 

RFADFG VLHRNELSGTLS GLTR VRRFQQD 

DAHIFCTVEQIEEEIKGCLQFLQSVYSTFGF 

SFQLNLSTRPENFLGEIEMWNEAEKQLQNS 

LMDFGEPWKMNPGDGAFY GPKIDIKIKDA1 

GRYHQCATIQLDFQLPIRFNLTYVSKDGDD 

KKRPVIIHRAILGSVERMIAILSENYGGKWP 

FWLSPRQVMVIPVGPTCEKYALQVSSEFFE 

EGFMADVDLDHSCTLNKKIRNAQLAQYNF 

ILWGEKEKIDNAVNVRTRDNKIHGEILVT 

SATOKLKNLRKTRTLNAEEAF 


2752 


A 


319 


495 


MVASFRESRVLLLGLWRVLTFDFLTQW 
RVGSECGDELVRLYSFTDEKANYLQQGGC 
R 


2753 


A 


23 


1255 


LRSIYTTHYRESVPKA/HLTDSFPDLLGLAA 

ED*HCPIALEAL*TITDAELRVTLTVEGKPV 

PFUNTEATOSTLPSFQGPVSLASITWGIDG 

\QA\SKPLKTPQ\LWCQH*TIRRFKHSFLVIP\ 

TCQVPLLGVEDTLTKLSASLTIPGLQLYLIAT 

LLPNPKPPLCPPLV/SPQLNPQV*DISTPSLT 

TDS 


2754 


A 


277 


467 


GLGPHDYLYSILSIERSCCC*CCCCCCRRRR 

CCCCOCV*GCSRFLCSIAESTPSGALRRLR 

GGR 


2755 


A 


86 


593 


ASALLFWGFAESLREFTADCPPYKCPVAP 

EPLPQPLS\PLQCPGEESTDSPFSLPTVQPVK 

SRCSPFIEESPRANRSIPAFGSHLECASCSSR 

SFHGPPPCCLWGLPLSAPSPHVLHPPASAAI 

GPACCVTSLCPGAPQAQRPRKVDQTSSAP 

GAGPGTQDGNERPNP 


2756 


A 


3 


3617 


YWKERPTQKVTPRATENHGLKSYLQKTKL 

SEDEAAFLLPDTNLKSELLELLTHWLQVGV 

PNTTPSLGSINLLGWLTELRETHTYICWFIV 

KETTRDTDEEMCRTEPALACSISHYCDDGC 

IQMLNTPETLQCSAKDSKHFIPBCECSIPGEN 

RPPSDTGKTVKFLSLNIFNLQLAESTDAEQ 

RANCILRCFLTETTLOTQKILSVRPGTK1AT 

ASHVSGLGLQTPPFGLAQHLIRPHAFLAPK 

DPLTSFTERNSRSGKTRCRSKKCAMRVVK 

SYSAILPKKRESVLTKTLLVAPTNEQTDPV 

LRMCCGKTGLKKGAGFTLESRGQRRMRA 

GCPTLCVRARXrTETDPSICSEVTFSWMILM 

LMDVCQCLGIEEFGTYCSLRSLDLFVPIFLE 

T/\rrA\ircr»TCOBT\yTT YX7UT /'YTXTO Cl' I * I ' I \J A T T\ 

KVFQVFEGTbor UYLLW rlA{ 1 liJtCU 1 lLv /\l>lj 

KIQKNSLDYQAETLVLFPYFLPNKWNLSVF 

AEPPGTGD WMQAPLWPPP LGLYWALEH 

YDQHVAKPARQRSLSLWPPPPTAHKGFLQ 

GHCXJCSLKTQRLFSQLMANAARPETQASG 

QWTPFSPGQIQKCSPRSRN ALGTP RACLLL 

YPTVAELGSTEFNVKPSICCTLPYOGAQSPS 
LHTLQLRGNGVGGQHQQFKTVSLDPFNAS 

FRDMKLKLGKSGISSWFVSIAAAVGDEGL 

VPRSMELYSQKAYDCLCCVMQWRKVGE 

SWQSQTSPSSHTTQKANLTSTLPPTTALSV 

FPGSGYQEWGTAVKILESMEATLEQDNKT 

RLEQFGGFRRKEDRKMWESLELPRDLWN 

DFDQNADSDMDNEVQAEWSDGDKELVR 

NWSKVWKGNVGLEPRYRVPTGALTSRVV 

RRGPPSFRPQKCRSTDSLHHEPGKAAGTQC 

QPVKDLPBCAVGAHSLHQPALDFRQEYLNP 

FSKNAKFQYECGNYSGAAENFYFFKGLVP 

ATDRNALS SLWGKLASEILMQNWD AAME 

DLTRLKETEDNNDEKPSFTHVVGKERYLN 

AIQTMCPQFFRY/L*LTAVHNKQGIVRKRR 

PRV*KI*LSFIKQE\SYTYKRPNLQNLLECL\ 

YVNFDFD GGSRKS *GECEPGLV\NDFFLGG 

*S*GFQ*K3VlPRLFIFEmCRIPPSVSAIN\ML 

AD\KLNMTPEEAERVD W *NLIRK WQ AWM 

PQDLIPKLGSCGLWGNNAV\SPLQQVIEKT\ 

KSLSFRSPDVGP*IMRKNLNQNSRSE\AP *R 

GOLODSGLLOCNHKEKMKKKNYQRKMK 


2757 


A 


1 


3090 


MHKELPALAACGLVADFDPVGEEETADFG 

PLVLDSDSDDSVDRDIEEAIQEYLKVGSSK 

DQGSASPVSMSRADSFEQSIRAEIEQFLNEK 

RQHETQKCDGSVEKKPDTHENSAKSLSKS 

HQEPATKWHRQGLMGVQKEFAFCRPP\R 

LAKTNVQPRSUISKVTTTTTQEKEGSTKPA 

TP/TRPSEAVQNKSGIKRSASTARRGKRVTS 

AVQAPEASDSSSDDGIEEAIQLYQVQKTHK 

EADGDPPQRVQLQEERAPAPPAHSTSSATK 

SAIJETHRKTPSKKKPVPTKTTDPGPGDLD 

ADHSPKIPK^TKAPPPTSPASRSKFVEWSSC 

QADTSAELIXAVLDEFKTILP/APMEGSDGSL 

SASPLFYSPNVPSRSDGDSSSVDSDDSIEQEI 

WTFLALKGTASEAPGGEGAARVPGDTRTS 

QGQGKTDEARHLDKKKSSEDKSSSLDSDK 

DLDTAIKDLL/RRVPGPSSQPWLLV*QQQFS 

GQRR*HRTGD*EVFGGKGQGVGSPRPGPA 

LSLEAHTCWRRRTAITGQAGRCVLCYDSQD 

PKCGDLKXPSKK^VKRKPYSTTKVTSGSTF 

h^OTRRYAVHTNQCRRPHGSRVKKKRYP 

QEDDFHHTVFSNLERLDKLQPTLEASEESL 

VHKDRGDGERPVNVRWQVAPLRLESSKY 

TGITCQENNIJ3AKKAPHEDTVHDITNEDA 

THDIANEDTVHDIANEAADKGIANEDAAH 

GIASEDAAHGIASEDAAQGIASEDAAQGIA 

SEDAAQGIAKEDAAQGIANEDAAQGIANE 

GAAQGIAKEDAAQGIAKEGAAHGIANEDA 

AQGIANEDSAHGIASEDAAHGIAIANEDAI 

YDIANDTVQGTLTRTLYTTSLMRTPYKAL 

VMRTLYMTSLTRTLYKPSLTRTLYTTSLM 
TAPYKTSPMRALYTTIIMIPTRHANADTV 

HDIANEDSVYDIANEGAVYDIANDTVQGT 

LTRTLYTTSLMRTPYKASVMRTLYTTSLTR 

TPYKPSLTRTLYTTSLMTAPYKTSPMRALY 

TTLLMIPTRHANVDAVHDIANEDTV 


2758 


A 


1 


1026 


MTLGPLTNQRKEHLTNFKSVSTP SSESFEC 

FFSTDSSDLSPSPQAARRQAEPGACFKCWK 

SGHWAEECLQPRIPPKLHPICVGPHWKSDC 

PAHLAATPRAAGTLAQGSLTPSQIFLAEWL 

KTDTARSPQKPPGPSQTLWVTLTVEVAAT 

ALILLEALKTTSYAPLTLYSSHNFQNLFSSS 

HLTH1LS APKTLQL YSLF V ESSTTnVAGPDF 

NPASHIIPDTTPDPHDCISLIHLTFIPFPHISFF 

PVPHPDHTWFIDGSSTRPNRHTPAKAGYAI 

VSSTFUEATALPPSTTSQQAKLIALTQALTL 

AKGLLVNIYTDSKYAFffiQYHHAVIWAER 

NFLTT 


2759 


A 


1 


383 


TRKCGQLPRSVSLPSGPQPLPGSVRHPKPV 

LRRPLPRAQGSSSSFRPRPPFAPDTMDKFW 

WHAAWGLCLVPLSLAQIGECPPQPGQQDG 

CGVLSADPAAAPPAESALGDWSQVSCLRS 

ALGSGKQGW 


2760 


A 


1057 


1226 


ARP SRVEAQMLG ARRAAS WLW AP WFCPN 

EG*NQPGQHSETPSLQKVLKPGMVV/HLL 

WSOLLGSLRWEDRLSPGD 


2761 


A 


349 


1 


NQTPFFFFFFGGTETTSTTLCSXYGLLILLKY 
PEV A/ES ASQRDPEWEAAVWRWLEGPGS A 
QPPSAPAKGQELDPWGQRPVPSPDDHVQ 
WPYTNAVLLEIORFISWKRTLTLDTLY 


2762 


C 


199 


531 


MTGIVAKQNSASVPLPARLVRPTVNRKLL 
GAGTGSLPRKEARRERFLDGDQDGDEGPR 
QPSMGLPHKQVQNRAMAKW1TFAPTNA 
MOLARSPKTLNFMKIIGEMESVLE 


2763 


A 


1 


1428 


MVNPTVFFDTEPLGRISFELFADKFPKTAG 

NFHALSTGEKGFGYKGSCFHRTVPGFMCQ 

GGDFTCHDGTGGKSIYREKFDDKNFIRKHT 

VSGILSMANAGPNANSSQFFICAAKTEWLD 

GKHVVFSKVKEGMNIVETMECFGSRNGKT 

KGAGLAGSHSQRWLAA S VCGASQPSRLLS 

TACRQQKLQISGRSKGCSRKTSGLEDQGLT 

KDGTNNTQGIKLQLGEEEEHSPRPSSLVPV 

SQLKANGSSSASIACAEDGPARPVPGCQCQ 

NQGHHQNKRPRTSQLCQMPKTHLVVADA 

RPNISRVFFGLPERESALWSFPRDWLVNLL 

NnrnFT gtrnofevevlsyghlplaysarc 

FTARSEDRPKDECETCCIKYPNGRNVLSQE 

NQQVFVLNGIQTMSGYVYNLGNELASMQ 

GLVDVVRLSPQGTDTFAMLDAFRANENG 

AAPLPLTANSDCNGYWRRLADFECTWAH 

SQGGCHA 


2764 


B 


1159 


2657 


MTCGTDGAITFWESLTGHRY1HKPTNPDEP 
PVAEQPKPLYPYRTIGCVFNHQMFLGNCQ 

PSDAVETCVFDLNDESKWKPMSEEAIKSV 

CAPGATTSLPPFPPLCASTEDASVTSNEIEM 

QLRLLVSEHRKYTKIHTCPSPTGGPVEPAD 

TKSQPSVCMDFTSHEYRISDPFLVEKNLPK 

EKTANTAGHQKEQTGDTLPLRNITGTVRV 

HGFILEVSETKNPPNPGHKTTSISQRPKALV 

SLGPEVRRGTRGEDEKALEKEGGGRRWEC 

GGANELCGRPPAFTRVTVHWGKGNDQTF 

QDLLDTGSELTLEPGDPKRHCGPPVKIGAY 

GGQnNGVLAQVQLTVDAVGPWTHPWFP 

SARMHNWNRHTQQLAESHIGSLTVHLSSD 

PKGCHSEWGPEQEKALQEVQAAVQAALEL 

EPYDPAGPWLEVSLADRDAVWSLWQAPI 

GESQQRPLGFWSKALPSSAAIKRVMHSSIP 

SSNGSGIYMIGLEQVRKAQIVLHDMQPPCE 

NGTASALQPLSRKSLKDSSEGKSSQWAEL 

RAVHLAVHVAWKEKWPDVRLDTDSWAV 

ANGIARWSGTWKEHDRKIGDKEVWGRGT 

RIELSEWSKTVTIFVSHCFYQDYHPSVGSQ 

NALYTNMWHTALPLTKALTLRLKNCNSG 

LMLTEFTGLTMFPnQGWGKVLQKAVYAL 

NQRPIYEWKEESCLHTGVADALRGNWAE 

GHREHKALWLGLWSTWSQHPLRSLKTTR 

HHPGLGVLSEDICEAGGATEELSRASGFAT 

GYGKRKEDTKKHKQHSVSDIM 


2765 


A 


3 


662 


TRIAETILKKKTKVGGTILSDFKMNKARVL 

EIVWYLWSNRCMNQWNRIEDPETDPQTN 

GALAIGHPQTKQIKLTNRPQSLNLNLRPDM 

KMNSKWIVD LNVKCEAIKTF/EKKTRENLH 

HQKHNLEDNTYKLNFKICSAKSAV/SRDCK 

K/PTA*EKIFANRLSNIGLISREYKQLLKLSS 

*KTV*LENGGLAWWLTPVIPSLREAKVDEP 

LEARGSRPAYPTW 


2766 


A 


736 


927 


SVAHSSCVSHTHMHTLLGRRATINCLFRN 

GRGQVQWLTSAVPALRKADVGG*LEPRSS 

RPAWAT 


2767 


A 


194 


3 


MVmTIAIRLMQFEFRQFFlKVNFRMRGL 
SKMAMLLLCRARPYSYKKEEGWSVLSGY 
FLTAGNF 


2768 


A 


593 


230 


DFYLYPERKKRGQMMTAVSLTTRPQESVA 

FEDVAVYFITKEWAMG\PAERALYRDVM 

LENYGGCGPL*CHPTSKPALVFS\LEQGKES 

CFSPATGSSLSRNDWRAGWIGYLELRRYT 

YLS 


2769 


A 


3 


4804 


KRI^QKTLEVAFSEAVWMQPSVVLLDD 

IJ)LIAGLPAWEHEHSPDAVQSQRLAHALN 

DMDCEFISMGSLVALIATSQSQQSLHPLLVS 

AQGVHIFQCVQHIQPPNQEQRCEILCNVIK 

NKIX)CDINKFTDLDLQHVAKETGGFVARD 

FTVLVDRAIHSRLSRQSISTREKJLVLTTLDF 
QKALRGFLPASLRSVNLHKPRDLGWDKIG 

GLHEVRQTLMDTIQLPAKYPELFANLPIRQ 

RTGIIXYGPPGTGKTLLAGVIARESRMNFIS 

VKGPELLSKYIGASEQAVRDIFIRAQAAKP 

CILFFDEFESIAPRRGHDNTGVTDRWNQL 

LTQLDGVEGLQGVYVLAATSRPDLIDPALL 

RPGRLDKCVYCPPPDQDGSSSSDSDLSLSS 

MVFLNHSSGSDDSAGDGECGLDQSLVSLE 

MSEILPDESKFNMYRLYFGSSYESELGNGT 

SSDLEDESMNQPGPIKTRLAISQSHLMTAL 

GHTRPSISEDDWKNFAELYESFQNPKRRKN 

QSGTMFRPGQKFFDEITELTYLPSFHHKAA 

PHQAEPGPNSSSASAPPPYNPFITSSPHTQS 

GLQFRSVTSPPPSAQQFPLKEVAGAKGIVK 

TALETAPTLALPVSSQPFSLHTAEVQGCAV 

GILTQGPGPCPVAFLSKQLDLTVLGSPSCL 

HAVASAALILLEALKITNYAQLTLYSSHNF 

QNLFSFSHLTHILSAPRLLQLYSLFVESPTU 

IIi>GPDFNLASHIILDTTPDPDDCMSLIYLTF 

TPFPfflSFFSVPHVDHIWFTDGSSTRPDRHS 

PAKAGYAIESSTSIIEATALPPSTTSQQAELI 

ALTRAFTL AKGLH VNIYTD SKY AFHILHIIH 

AVWAERGFLTTQGSSIINASLIKTLLKAAL 

LPKEAGVTHCKGHQKASDPITLGNAYADK 

DRTIDGSSQVIEEKNHNGYSVTDTGTLVEA 

ELEKLPNNWSPQTCELFALSQALKYLQNQ 

KTISILIQKEPSPALGLTPERKGNVGHAGKG 

PLESSSPDPFLCGQERREKGCRTATSVSITN 

PINRGPWWTHPGKELTPEHKGNVGHAGR 

DILAKAGAIIHLNIGEGTPVCCPLLEEGINPE 

VWATEGQYGRAKNARPVQVKLKDSTSFP 

YQRQYPLRPKAQQGLQKIVKDLKAQGLV 

BCPCSNPCSTPILGVQKPNRQWRXTLCHQAT 

QALFNFLATCGYMVSKPKAQLCSQQ/RYL 

GLKLSKGTRALSEEHIQPILAYPHPKTLKQL 

RGFLGVIGFCRKWIPRYGEIARSLNTLIKET 

QKANTHLVRWTTEVEVAFQALTQAPVLSL 

PTGQDFSSYVTEKTGIALGVLTQIRGMSLQ 

PVAYLTKEIDWAKWAVAVLVSEAVKIIQ 

GRDLTVWTSHDVNGELTAKGDLWLSDNC 

LLKCQALLLEGPVLRLCTCATLNPATFLPD 

NEEKIKHNCQQVISQTYATRGDLLEVPLTD 

PDLNLYTDGSSFVEKGLRKVGYAWSDNG 

ILESNPLTPGTSAQLAELIALTWALELGEEK 

u a xrrvTn cnrv A VT VT W AHAAIWKEREFLT 

SERTPIKHQEAIRKLLLAVQKPKEVAVLHC 

RGHQKGKEREJEENCQADIEAKRAARQDP 

PLEMLDCQPLV 


2770 


A 


1 


2919 


MLLATALRGFLKNGDRGHVDTEEWRSYP 
WAASFGQLRSSQNCPGASASGRTGVPTVL 
VARTD AD ASDLITSDCDP YDSEFMTGERTS 
EGFFRTHAGIEQAISRGLAYAPYADLVWCE 

TSTPDLELARRFAQAIHAKYPGKLLAYNCS 

PS^NWQKNLDDKTIASFQQQLSDMGYKPQ 

FITLAGfflSMWFNMFDLANAYAQGEGMK 

HYVEKVQQPEFAAAKDGYTFVSHQQEVG 

TGYFDKVTTnQGGTPDKAFTPHPASKPAH 

KPGEQPMBCNNPLISIYMPTWNRQQLAIRAI 

KSVLRQDYSNWEMIIVDDCSTSWEQLQQY 

VTALNDPRITYIHNDINSGACAVimQAIML 

AQGEYITGIDDDDEWTPNRLSVFLAHKQQ 

LVTHAFLYANDYVCQGEVYSQPASLPLYP 

KSPYSRRLFYKROTGNQVFTWAWRFKECL 

FDTEUCAAQDYDIFLRMVVEYGEPWKVEE 

ATQILAINHGEMPIHSSREHFRVLPFCRSTR 

PFRQARKISRVIVTSTKSDSLYTVGMLALS 

VRAIRCPLYLLTGL1SVSKNGLWYCELQVA 

OIGRSXnrLYEKAFPLSEQCSKKAHDQFLA 

DIASE^SNTTPLIVSDAGFKVPWYKSVEK 

LGWYWLSRRMQIEETFRDLKSPAYGLGLR 

HSRTSSSERFDIMLL1ALMLQLTCWLAGVH 

AQKQGLDLGVYGAPETFLIDGNGIIRYRHA 

GDLNPRVWEEEIKPLWEKYTLATIDVLQF 

KDEAQEQQFRQLTEELRCPKCQNNSIADSN 

SMIATDLRQKVYELMQEGKSKKEIVDYMV 

ARYGNFVTYDPPLTPLTVLLWVLPVVAIGI 

GGWVIYARSRRRVRWPEAFPEQSVPEGK 

RAGYVVYLPGIWALIVAGVSYYQTGNYQ 

QVKIWQQATAQAPALLDRALDPKADPLNE 

EEMSRLALGMRTQLQKNPGDIEGWIMLGR 

VGMALGNASIATDCY ATG YRLDRTTVML 

DGDR 


2771 


B 


1 


1773 


MALGISAPVALQGTAPLLAVLSGCSFPKH 

MLQTVNGSPFWGLENGGPLLRARLGSAPV 

ETLELFSSLNK1LHSYH SS VVKCDLILLGRW 

TKAWDPLSAGGGCHTGPLPLQVEGNHPTG 

SYRVPNRPQYRSVAWGLGTSGLVNYTFLL 

NSGETIYQFLRGNKDFIJCNHIKJLNYCFLLI 

EVDNLTLVFVIEKTLGQIFDIPKVELLFSYQ 

CFPMVENRQKPEGEEDCVIQLSELSCTECS 

KKAWRMEVIiiThnCTTNATQCGGPAQLQQ 

FNAVLSEKVHIVPSLLRSWNnSHGRFPSFE 

TFhHTCNCIAYNPNGNALDESCEDKNRYIW 

LEKPQETYSNDRRESKMPLRMAAERRRAE 

QKEKYPLTKSSDLGASEAIRQRQSSAAKLR 

KSGKESVREPWARVPGALGVAARALIAED 

AGLSRVILFHYGESWNLLRADQRLIFAKS 

WPRASRYQQGHQDLFILRSDLPSQVFIRDK 

LMERRNRRTGRTEKARIWEVTDRTVRTWI 

GEAVAAAAADGVTFSVPVTPHTFRHSYAM 

HMLYAGIPLBCVLQSLMGHKSISSTEVYTKV 

FALDVAARHRVOFAMPESDAVAMLKQLS 


2772 


C 


148 


306 


MRPCCWWATLCGKHLRMCSHALKMRPN 
ASAAETEQLNAHSRGLMNSSSRPAP* 


2773 


A 


2874 


3062 


GNRAGALPGATLLILAGFLPSAHQNRPSRN 
PVSRPPNTQRVARRKHY ALADGYTERRWT 
NAP 


2774 




1 


660 


MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPV 

AQYPTIAPPAVTISASYPGADAKTVQDTVT 

QVIEQNMNGIDNLMYMSSNSDSTGTVQrr 

LTFESGVQVQNKLQLAMP/LLPQEVQQQGV 

SVEKSSSSFLMWGVINTDGTMTQEDISDY 

VAANMKDAISRTSGVGDVQLFGSQYAMRI 

WMNPNELNKVERNSRRQDVGERDISSGSR 

KVNKESREDEEVT 


2775 


A 


78 


264 


PVERSNLGVRLYACCGLLLRPAYPQHFAH 
GYVDKIPDYPRRAGTLTGLHPMQVCRCRR 
AREL 


2776 


B 


1 


921 


MLDDYGGSLSELAREQLPAAEQAALAQLA 

ARSLAPVPDDTGGAGMSNDTPFDALWQR 

MLARGWTPVSESRLDDWLTQAPDGWLL 

SSDPKRTPEVSDNPVMIGELLREFPDYTWQ 

VAI AD LEQSGRIGDRFG VFRFPATLVFTGG 

NYRGVLNGIHPLAELINLMRWLVEPQQEL 

HQPLTTVQNANDCCCDGACSSTPTLSENV 

SGTRYSWKVSGMDCAACARKVENAVRQL 

AGVNQVQVLFATEKLWDADNDIRAQVES 

ALQKAGYSLRDEQAAEEPQASRLKENLPLI 

TLIDS S YFPHGTELAF 


2777 


A 


47 


275 


FPCPPAPHVCGPPPCPRAFPVGQSSSQPQV 
ATGFP*SPVCPPPRLYWGPGTERHWVETH 
YRAFLPSQHLSSPVTAA 


2778 


A 


749 


1020 


VLVRDPSQPAQPFSVSFSPQKHRDEKLYFL 
PKGVSGGSELRGRPQPYLPCPVSPTLCPWG 
HLSLAPPSVPPTACESSSELWPSLSWTWAE 


2779 


A 


271 


86 


MPLHTCLVHVGVSHAARGSPVCPSVLWV 
WFCVHFQVIHMWAHECVQADVWAHIQD 
CAQVCV* 


2780 


A 


3 


523 


AAANRKRAAYYSAAGPRPGADRHGRYQL 

EDESAHIJDEMPJJV1MSEEGFENEESDYHTL 

PRARIMQRKRGLEWF VCD GWKFLCTSCCG 

WLIMCRRICKELKARTVWLGCPEKCEEKH 

PRNSIKNQKYNVFTFIPGVLYEQFKFFLNL 

YFAVISCSOFVTALKIGYLYTYWAPLGF 


2781 


A 


2 


141 


EQFLRRQIASEKEEIERLKAEIAEIQSRQQH 
GRSETEEYSSLLLQF 


2782 


A 


3 


402 


GNGGFWHWLNNKEFHFTSSTEVFMHQLR 

KLSDKQVDHENDDADREDEEHSQEDRER 

GLHMKLDHDLSLDRESEAGTGSSEHEDGE 

REGSPRTYSRLSVPMPLPTVLLDRKIETLLT 

EWNKNPDMLFTIHPMY 


2783 


A 


333 


695 


ISVFRSPGQSTSQHDAATWPFLIHSGEGPTP 
SRRKAPPAFHPHTOACPSTCYCHTLASRRG 
PCNGRYHRPVYPHPTAMQRDPPAGPRGCQ 

SPCVraYTPACRHPCGRHYR*HGQHDPPPW 

Q*HC*FGSPGQSTSQHDAVTWPFLHIPGER 

PTASRRKAPPAFHPHTQACPSACYCHTLAS 

RRGPSNGRYHRPGYPHPTAVQRDPPAGPR 

GCQSPC*HEPPACRHPCGRHYR*HGQHDPP 

PWO 


2784 


A 


91 


297 


MSLVKLFNLLVFSYRRGAVITIKIEVKIKVT 
YVKCQAHGERLINGHYDYSACHVIKLMFC 
AEEKKPHQ* 


2785 


A 


2 


103 


TGEKWPGEVNPPNGPVGDPLSLLFGDVTS 

LKSFD SLTGCGDILAEQDMDSMTDSMASG 

GQRANRDGTKRSSCLVTYQGGGEEMALP 

DDDDEEEEEEEEVELEEEEEEVKEEEEDDD 

LEYL*EGSTRRGKPTQWPCGGPTEPLVWG 

CDIPEKL 


2786 


A 


24 


332 


QPQYIAPLMANFDPSVSRNSTVRYFDNGT 
ALWQWDHVHLQDNYNLGSFTFQATLLM 
DGRHFGYKEIPVLVTQISSTNHPVKVGLSD 
AFWVHRIQQIPST 


2787 


A 


210 


281 


FHHKQLHNPVLECHQPAGPCHYL 


2788 


A 


2 


1211 


WTPPGAPGAKGPRQGGCCSGLLRPPRVSG 

KTCGARPPWPWRSLSRIPKREGLGEEDTA 

VAGHELLLPNERSFQNAAKSNNLDLMEKL 

FEKKVNINVVNNMNRTAIJIFAVGRNHLS 

AVDFLLKHKARVDVADKTRMRELLLEIFL 

TVPRAQFHDLHCLESKLEDCEMRDTLRHM 

QAVYRETNILTHTVTCVRLGALSYLKTMA 

CRPQQMLSDK>JMDSVLTSYMNUjKIJHNL 

SVLQFLYLKNEDKNSTYVNLILSERIPTLIF 

QIQKPKYREVMQLAQMLWLALTLFSFTV 

VVLNSIRAMVPSERIFKAKDLLSRKIHIHIY 

DKNIAYESAVPIMPVIPQTGSPTYTSSAALP 

QCLTPGNTIHSVAIVNGSSWSSALRSQCDH 

RLHTCSFTLVPORHPHTQLI 


2789 


A 


1 


334 


FRANRTVKDAHS1HGTNPQYLVEKTTRTRIY 
ESKYWKEECFGLTAELWDKAMELRFVG 
GVYGGNDCPTPFLCLTIJKIMLQIQPEKDIIVE 
FKNEDFK*VQCSLAN1RGMY 


2790 


A 


3 


1794 


AMLPMELGCGPLPEPLPVGCSRFSLFK*QT 

CISTVP/GYMVTAQSMSSTPPPPSPSTLPSSP 

SPPPPLPQPLPPPPPSPPTLSSLSSPSPPRPPL 

VSPSTLPSPQPSSPQPLLPPSSSPPSLPSPPPP 

SPPLPSPSPSAIPSLPPPSPQPLPPPPPSSPPPS 

LPSPLLPPPPLSSSPSSPLSPSPPPPSPPPSLPP 

SPPPSPPPPPPPQPPSPPSSPLSSPPLSSSQPSL 

LPPSLSSLPLPSSPSPLLPLSLPLSISPP*LSLL 

SPLPPSPSLPPSSFQST*TIGQCFSL/VMWHV 

APCTYLAIAGNTIMAWPLMSASSKASGG 

VSMFVWRNVEPCSVAVFSWYSVPFLTPPC 

SRVRPSNLPVTQWPPTRAKNLPSROLLLTS 
VHQAQSLSALCKEQDSSSEKDGRSPNKWD 

KDHIWWPMSGGHDLQQAAPGPGRAHQGH 

PYQDNWTISQILSERWYTLGPNEMQKYHD 

LAFQHMAGEDIASDEEHMVIHEEEGVMVS 

LLMTALAPLTLISSSR1FGKVYGPTPSSSYT 

YSD ASSSTLAPTSFLLGPGAFKAQESGEEA 

EDGLRELETEKALSSSURRALDQ/*LAL1M 

OLFOAHCFFLST 


2791 


A 


230 


2579 


A1CDPCYWRMEKSPRMMEKKLSKGMIPD 

WESRWENKELSTKKDNYDEDSPQTVT1EK 

WKQSYEFSNSKKNLEYIEKLEGKHGSQV 

DHFRPAILTSRESPTADSVYKYNIFRSTFHS 

KSTLSEPQKJSAEGNSHKYDILKKNLPKKS 

VIKNEKVNGGKKLLNSNKSGAAFSQGKSL 

TLPQTCNREKIYTCSECGKAFGKQSILNRH 

WRIHTGEKJPYECRECGKTFSHGSSLTRHLI 

SHSGEKPYKaECGKAFSHVSSLTNHQSTH 

TGEKPYECMNCGKSFSRVSHLIEHLRIHTQ 

EKLYECRICGKAFIHRSSIJHHQKIHTGEKP 

YECRECGKAFCCSSHLTRHQRIHTMEKQY 

ECNKCLKVFSSLSFLVQHQSIHTEEKPFECQ 

KCRKSFNQLESLNMHLRNHIRLKCDFYLM 

NAIYVGKPLVIGHPCFNTTEFILERNLTNVL 

NVGRPSAWQTLPYIREFILEKSHINWSVG 

KLLAKAQILLPIKEYIMERNPIVWEPLQPVV 

SRQALGHQAGESRGHTQRCKVTRLSSWQ 

VLVGAAVPCSGARDRVPVPRHVPQACLQG 

RVQTGRLDWRGHACSASPNAVPTVTFSDV 

AIDFSHEEWACLDSAQRDLYKDVMVQNY 

ENLVSVGLSITKPYVITLLEHGKEPWMVEK 

KI^KGMIPVLEVLARAMRQKNEIKGIQLG 

KEEVKLSLFADDMIVYLENPIVSAQNLLKL 

ISNFSKVSEIPKSIvr^NHKAIT.YT>mRQTE 

SQIMSELPFTIASKRIKYLGIQLTRDVXDLF 

KENYKPCSTK 


2792 


A 


154 


331 


1PAAATCMGSLLGG*ETPGLWARRSVKSR 
GLFPGLPSPSRASVRSLXLLPAWAAFLEGIV 
DTRPTAWRAi^WTLFLSVFCQFLDFPETSL 
DSOKLSLDTPSF 


2793 


A 


213 


446 


ILLQRSLGVGGHRAWGIQEPSKVLVSGRRT 
EAPSMLQMGRQMWGRTSWRWTRTWRCG 
WPWGGPLAARHVSSCTKOGH 


2794 


A 


515 


278 


IFTLFDKLSSQIPSILRSQYQSCLYDPSQPWP 
PPTSDAHDHKHGPHIAPPPPLPCLLGLASPF 
RSIQYISARPQLKGPF 


2795 


A 


1 


708 


VTAGVPKGHCPRRGTSSAIASCPPYGSPPR 
AECALRAGSTVTT*RRSCCTSYSSGRPPTG 
RRGSWTLVCTSCCASWRRACSRRSSTSSSS 
ATARCLRPWDSLRPCSGPSPSTSSGPSSCSE 
AFTVRHWTPSMRCSRMPQRSLASIPYMS/S 
SDOPTPKS*RLLONVGSSS*DEGIPHVHTPG 
GICQPCSGDKAGFRGSRAQPARKPSPTVQR 
KONFNGKLVCFIPLGSAGKAVIWV 


2796 


A 


2 


590 


FQGRGLAAND GEYLKM2 w 1 v L/vr a 
CPLSTI^VLSSPPRELQAMEALQNGQTTVE 

GSIEGQSAGAASHAMIEKILSEEPRWQETA 

YVLGNYKTEPCKKPPRLCRQGYACPYYHN 

SKDRRRSPRKHKYRSLG/TQEASHGREEW 

QGRGQAEAAPTGSPGGGEAGPGDDRIASP 

GPRGCjrribbDibW 1 VUA^LnLLnD 


2797 


A 


319 


513 


IELRAVAQGIAQSLGQLLFTQCPLEKKDLE 
GLFLQNNKEGVQKGRDEPLPPLP * ATALSS 
IQAGIQQAR*EGDLEAWQFPVRHTPPDQQG 
NIIVTFEPFPFKLFKEFKQAVNQYGPGSPFV 
MGLLKN VA VS SWMIPTD WD ALTRACLTP 
AOFLOFKTWWADEAGRV 


2798 


A 


1 


915 


MSTAVWKVVLCTVAPGRGSAPSLSSCLD 

WKVNGAEGSHNKDLFVLTYGALVAQLCK 

DYEKDEDVNQYLDKMGYGIGTRLVEDFL 

ARSCVGRCHSYSEiroilAQDMERGFCALHI 

DTEGRYEWWTTSTQLQSTLPRAAv^Cb viy 

KQPDRKSLTVGQKIEVGNPGIGTEQSPQGL 

VRFATQAFLTTHRAEGLQQSQVKGSVIHL 

KSQDKCGEHRFTTNQVETGDPVRESSSQH 

SVGRGGPKDIQIQGANVPVRQCNLLWRITL 

GPLETPHLEFSGECSLLAAMEAPEHTWDQ 

EKSDIPEPPHRSS 


2799 


A 


75 


642 


EKLLNPQT1 SFFLQLLQKKQW YPKM* FOUL 

PSQGLLPAARVQKCLLVLRNVSGSPFPFLI 

GFPPPBLELKESYPWAGTDIQCEPAQGHVL 

TSPSPTLRXLQGAPDLPAGEPAWLLLTAREE 

DDG*NFSC*ASLWQGQRLMKTTVIQLHIL 

CEWRPDLSCQNKDYYFPISRELLGQQCFDT 

VATFFSL 


2800 


A 


1 


1146 


MVGECGTKLEVMQVHLSNPRDELEGELRS 

IRVTMGQVWALVHSTLEPFHTNEEEEGLY 

NKVTEEVTEQVCLPAKAKAAKEGEVHPYP 

SPFPHYFEETEWPDPPDLSFLEDTGGDPSLT 

SHWQLTKEAEAELQLIEKQVHKAQINRIDP 

EKEPDLLIFSTQHSPTGVIVQEQDLVEWLFL 

PHTNS WTLTP YLDQN A 1 MlvjrN JdK i kh v jvl. 

HGYDPRIOIVLLMXANIQQAFINGLTWQTH 

LANFVVILDNHFPKJV1KLFQ 

ITKFKPIKGAENVFTD GSSNGKAS YSGSKG 

LSQQLIWISSRNLKPYHESDAEEEIPGRTQG 

TPGCSHVETDTEEDPNCHEQHPLNTATHL 

GTDQEAVTDGGRKPEERGTTSHNE 


2801 


A 


2 


926 


RPEPSCRPRSEYQPSDAPFERETQYQKDFR 
AWPLPRRGDHPWIPKPVQISAASQASAPIL 
GAPKRRPQSQERWPVQAAAEAREQEAAP 
GGAGGLAAGKASGADERDTRRKAGPAW 
MVRRAEGLGHEOTPLPAAOAOVQATGPE 
AGRGRAAADALNRQIREEVASAVSSSYRN 
EFRAWTDIKPVKPIKAKPQYKPPDDKMVH 
ETSYSAQFKGEASKPTTADNKVIDRRRIRS 
LYSEPFKEPPKVEKPSVQSSKPKKTSASHK 

•rvnTh-rr a trT\x/T\ A \rOf~±T\ A A VFVC AT?f"JPQT r FK'P 

PTRKAtCDKQA V AAJsJU<^ AJfcLir o 1 1 js^r 
DDKEQSKEMNNKLAEAKE 


2802 


A 


25 


435 


TKYWLLLFFLILILPFFFWRRSRSVTQAGG 

QWHDLGSLQPPPPGFKQFSCLSLPSSWDYR 

RAPLHLANFYIFSRD/MDFTMLARLVSNSR 

SQ/CDPLASASQSAGISGKSQHTRPVLVLLK 

TYTNSH/SF*VKGLGWEFIL 


2803 


A 


1186 


1074 


TAAARRSSRTSSHRSLLHVPENLATGPSEF 

RSPGFLLSRVPSVWDPTENRTVQLTWQPLP 

EPLELWPKA/HLTDSFPDLLGLAAED*HCPI 

ASEAP*TITDAELRVTLTVEGKPFPFLINTE 

ATHSTLPSFQGPVSLASITWGIDGQASKPL 

KTPQLWCQLRQYSFKHSFLVTPTCPVPVLG 

*DTLTKLSASLTIPGLQLYLIAALLPNPKPPL 

RPPLVSPDLNPQV*DPHSCPPENKPPLTVIF 

LYLPKSYKTAPPHLPLLTLFSDSARLHPGEI 

NSHVAHTKPVWWSLHTDAHEIWCRHSDR 

GTSLGRSIPCPPALCSMRKIHLRPQVLRQTS 

PRNISPISNPVSGLFLLSSPTCLTIPQPLSPFN 

LGATLQSLPSLNFNSFHFLVETKETRFICGP 

KTPALVTDWEGSLPLNIFNHCRDTSLlffiUPC 

FQGVRPCRDACLSPSPLAASPAFLGKGQVP 

LNPFFTLSGKSRFSGGGASTPTPSFHVSTPS 

LLFWGRGKYPSTPSSPLVASPAFLGKGQVP 

LNPFSFTLSGKSHFPGTGARFN 


2804 


A 


3 


810 


GVSPCWPGWSRTPDFGSNPKCPPIRASPGA 

ELQAI^STVTTPYWGILVTAVFPH*GLRPR 

QCRQDHPAGRQGPGPGEVPEILGQSGCTD 

RTWSKAGGRTQAPGPRSRAGRRVSGQEIR 

APGPLGCRHGG/VGAPWTPEAASPLTATEP 

SCPH/LQAPCGYMPLSVSPRRRYRGPAGDQ 

KVKMLKFKAFCLDYWQFLCLQPLHGAYK 

RDSDLMTWIWGLLPEVTGAAGTTSPNVHT 

SGRFFRACWCPVHTLVKKEPHPGC^blllvi 

EPSPWSP 


2805 


A 


62 


475 


FEPLFYLMCLLNLFPLQLPRHPFLFLTVDLV 

NTWGCPLPSSPQ*EWLLAAPHRSTPPPLSS 

GFPARRQLEPGAGARGP/HHTQALHLSFFF 

VFLRRSL/DSVAQAGVQWRGLGSLQPLPPG 

FVULSSPLSLPSLTY 


2806 


A 


3 


4BU4 


KRT FNTOKTT EV AFSE AVWMOPS V VLLDD 

LDLIAGLPAVPEHEHSPDAVQSQRLAHALN 

DMKEFISMGSLVALIATSQSQQSLHPLLVS 

AQGVHIFQCVQfflQPPNQEQRCEILCNVIK 

NKLDCDINKFTDLDLQHVAKETGGFVARD 

FTVLVDRAIHSRLSRQSISTRE1CLVLTTLDF 

OKALRGFLPASLRSVNLHKPRDLGWDKIG 
GLHEVRQILMDTIQLPAKYPELFANLPIRQ 

RTGILLYGPPGTGKTLLAGVIARESRMNFIS 

VKGPELLSK YI GAS E Q AVRD IFIRAQ AAKP 

CILFFDEFESIAPRRGHDNTGVTDRWNQL 

LTQLDGVEGLQGVYVLAATSRPDLIDPALL 

RPGRLDKCVYCPPPDQDGSSSSDSDLSLSS 

MVFLNHSSGSDDSAGDGECGLDQSLVSLE 

MSEILPDESKfTSnVTYRLYFGSSYESELGNGT 

SSDLEDESMNQPGPDCTRLAISQ SHLMTAL 

GHTRPSISEDDWKNFAELYESFQNPKRRKN 

QSGTMFRPGQKFFDEITELTYLPSFHHKAA 

PHQAEPGPNSSSASAPPPYNPFITSSPHTQS 

GLQFRSVTSPPPSAQQFPLKEVAGAKGIVK 

TALETAPTLALPVSSQPFSLHTAEVQGCAV 

GILTQGPGPCPVAFLSKQLDLTVLGSPSCL 

HAVASAALILLEALKITNYAQLTLYSSHNF 

QNLFSFSHLTHILSAPRLLQLYSLFVESPnT 

ILPGPDFNIASHIILDTTPDPDDCMSLIYLTF 

TPFPHISFFSVPHVDHIWFTDGSSTRPDRHS 

P AKAG Y AIES STSIIE ATALPP STTS QQ AELI 

ALTRAFTLAKGLFIVNIYTDSKYAFHILHHH 

AVIWAERGFLTTQGSSIINASLIKTLLKAAL 

LPKEAGVTHCKGHQKASDPITLGNAYADK 

DRTIDGSSQVIEEKNHNGYSVIDTGTLVEA 

ELEKLPNNWSPQTCELFALSQALKYLQNQ 

KTISHJQKEPSPALGLTPERKGNVGHAGKG 

PLESSSPDPFLCGQERREKGCRTATSVSITN 

PINRGPWVVTHPGKELTPEHKGNVGHAGR 

DILAKAGAIIHLNIGEGTPVCCPLLEEGINPE 

VWATEGQYGRAKNARPVQVKLKDSTSFP 

YQRQYPLRPKAQQGLQKIVKDLKAQGLV 

KPCSNPCSTPILGVQKPNRQWR\TLCHQAT 

QALFNFLATCGYMVSKPKAQLCSQQ/RYL 

GLKLSKGTRALSEEHIQPILAYPHPKTLKQL 

RGFLGVIGFCRKWIPRYGEIARSLNTLIKET 

QKAOTHLVRWTTEVEVAFQALTQAPVLSL 

PTGQDFSSYVTEKTGIALGVLTQIRGMSLQ 

PVAYLTKEIDWAKWAVAVLVSEAVKHQ 

GRDLTVWTSHDVNG1LTAKGDLWLSDNC 

LLKCQALLLEGPVLRLCTCATLNPATFLPD 

NEEKIKHNCQQVISQTYATRGDLLEVPLTD 

PDLNLYTDGSSFVEKGLRKVGYAWSDNG 

ILESNPLTPGTSAQLAELIALTWALELGEEK 

RANIYTDSKYAYLVLHAHAATWKEREFLT 

RGHQKGKEREffiENCQADIEAKRAARQDP 
PLEMLDCQPLV 


2807 


A 


1 


591 


MTPRGTGGDSEVPFQAAKPLSVKQGVSFR 
LWARRRPRCDFLRSSRIRVHPTPAASTMPP 
KFDPNEIKWYLRCTGGEVGATSALAPKIG 
PLCLSPKKNROAQIEVVPSASALIIKALKEP 
PRDRKKQKNIKHSGNITFDEIVNIARQMRH 
RSLARELSGTIKEILGTAQSVGCNVDGRHP 
HDUDDINSGAVECPAS 


2808 


A 


1094 


483 


IGCDVLINNAGIFQCPYMKTEDGFEMQFGV 

NHLGHFLLTNLLLGLLKSSAPSRIVVVSSK 

LYKYGDINFDDLNSEQSYNKSFCYSRSKLA 

NILFTREIARRLEGTIWTVTWLHPGIVR^ 

LGRHWFHCWSNHSSIW/WSWAFFKTPVE 

GAQTS1YLASSPEVEGVSGRYFGDCKEEEL 

LPKAMDESVARKLWDISEVMVGLLK 


2809 


A 


1775 


1981 


HI WQN SLIVIJFRGCRS AHAK VHR WKN * LP 

LNLAPLLPRSGSSAPIRPPPSAQARQPMKST 

YGVDRRHS 


2810 


A 


272 


51 


MLLLSSSLLKCGTCQWQVQPAVAGSLEGG 

EEESMVSALUSALPFLGTSHVTVETLDVQ 

YTVFPKLICFLPCE* 


2811 


A 


3 


357 


FGFNGCSKRIIKLQELSDLEERENEDSMVPL 
PKQSLKFFCALEWLPSCDCRSPGIGLVEEP 
MDKVEEGPLSFLMKRKTAQKLAIQKALSD 
AFOKLLIWLG/ODCLDHP*STSVSVSK 


2812 


A 


94 


3006 


RTRSLTRKAMAEHAPRRCCLGWDFSTQQV 
KWAVD AELNVF YEES VHFDRD LPEFGHV 
LDVHGVHVHKDGLTVTSPVLMWVQALDn 
LEKMKASGFEFSQVLALSGAGQQHGSIYW 
KAGAQQALTSLSPDLRLHQQLQDCFSISDC 
' PVWMDSSTTAQCRQLEAAVGGAQALSCL 
TGSRAYEFNLVCDRKHLKDTTQSVFMAGL 
LVGTLMFGPLCDRIGRKATILAQLLLFTLIG 
LATAFVPSFELYMALRFANGLLPSLDLASA 
MSPY *QNGWGPHGGRRP WSWPSATSPSGR 
WCLRDSPTVSATGGSFRSPALRLAYCSS\LL 
LGSARICTLAPDP WEDGRGDTTDPENGLG 
Q * AETLPGAHEP AGPREDRPLRECPGS VQT 
PPAPEGDPDYLLCLVCGQSGVLRPEPPSGG 
LRPGRLSDAAHLWSC*GACPLFQHLHDAE 
VWPQVEP/RWGPWSWVA*CVSSSSSSQQIC 
PWWSPCWLWWGKWPQLLPLPSPMCTLPS 
FSPPSSGRQAWGWWASSHGSGASSHHL*S 
CWESTTLPSPCSSTAASPSWPA/SLCTLLPE 
THGQGLKDTLQDLELGPHPRSPKSVPSEKE 
TEAKGRTSSPGVAFVSLGTSDTLFLWLQEP 
MPALEGHBFCNPVDSQHYMALLCFKNGSL 
MREKIRNESVSRSWSDFSKALQSTEMGNG 
GNLGFYFDVMEITPEIIGRHRFNTENHKYF 
KGKGAPGHPMPSLKANFDLLACLRGVGSS 
TLLLWPAVLGAQTRQAGVNEGRSQVADF 
LRIPVTGCPEQRRNPPSPPAPLGTGGPAEER 
LQFPGVAGSRRGRGRILRAGGIGRASPGEG 
TGAPRPRAGQGRGGPGKPESGGGGPVALR 
PGDCTCCVLKSQPRQQRRGACSAMAFRVR 
LRVRQSVRPPRGVIVAALORPETQGPAPSS 
ARPDCGPESRGGLALWRKLRGYASRDRVL 
OS^RCPHAAKFPSKRTPSGSPHLHLMSSW 

AVP 


2813 


A 


1 


897 


MTYGVGKGDMVDGTKERGERIESALGTS 

HIMRVAEPQGSQSWCPDEELRPVGSPATA 

AQKLPSTPGALGPTHSTECCSEPLDPKAQQ 

GU^KIVKDLKAQGLVKPCNSPCNTPILGVQ 

KPNGQWRLVQDLRIINEALVPLYPAVPNPY 

TLLSQIPEEAEWFTVLGLKDDFFCIPVHPDS 

QFLFAFEEPSNPTSQLTWTVLPKGFRDSPH 

LFGQVLAQNLSQFSYLDTLVLRYVDDLLL 

AARSETLCHQATQALLNFLTTCGYKVSKP 

KAQLCSQEVTYLGLKLSKGTRALSEERIQP 

ILA 


2814 


B 


71 


2167 


XPAEAIXDGEERQKNTCKXAKKIKARMNF 

RAKEYESLMETKNSGSDSPYKAKLQRLAK 

DLLKQVQVQDSGSWANNKVSALDRTLGEI 

TR1LEKENVADQIAFQAAGGLTALEHILQA 

VWATNVNTVLRNSSMPQDSYMQCVTLCF 

AVTGRSYSIFDNNRQDPTGLTAALQATDL 

AGVLHMLYCVLFHGTILDPSTASPKENYT 

QNTIQVAIQSLRFFNSFAALHLPAFQSIVGA 

EGLSLAFRHMAS SLLGHCSQ VSCESLLHE V 

IVCV GYFTVNHPDNQGDRA VRPPPHSAAK 

SSASCPSSISVTHG 


2815 


A 


1 


473 


EVRWNSPPTDSLSPDGGSIELEFYLAPEPFS 

MPSLLGAPPYSGLGGVGDPYAPLMVLMCR 

VCLEDKPIKPLPCCKKAVCEECLKVYLSAQ 

IQCPTCQFVWCFKCHSPWHEGVNCKEYKK 

GDKLLRHWASEEEHGQRNAQKCPKCKIHI 

ORTEGCDHM 


2816 


A 


1 


1286 


RGAVFPGPEHSVPEESVTFEDVAWFTDEE 

WSRLVPIQRDLYKEVMLENYNSIVSLGLPV 

PQPDVIFQLKRGDKPWMVDLHGSEEREWP 

ESVSLDWETKPEIHDASDKKSEGSLRECLG 

RQSPLCPKFEVHTPNGRMGTEKQSPSGETR 

KKSLSRDKGLRRRSALSREILTKERHQECS 

DCGKTFFDHSSLTRHQRTHTGEKPYDCRE 

CGKAFSHRSSLSRHLMSHTGESPYECSVCS 

KAFFDRSSLTVHQRIHTGEKPFQCNECGKA 

FFDRSSLTRHQRIHTGESPYECHQCGKAFS 

QKSELTRHQLIHTGRKPYECNECGKAFYGV 

SSLNRHQKAHAGDPRYQCNECGKAFFDRS 

SLTQHQKIHTGDKPYECSECGKAFSQRCRL 

TR HOR VHTGFKPFECTVCGKVFSSKSS VIQ 

HORRYAKQGE) 


2817 


A 


94 


255


MLYIECKSHKLVAPLAVFFALFFLLIFFWV 
AFSYPFELLFLOLRSRQADIGVO* 


2818 


A 


551 


19 


TGTIDKLQGSGPHLLRDWAFHPPWRKICL 

HCKCPQEEHMVTVMPI^MEKTISKLMFDF 

ORNSTSDDDSGCALEEYAWVPPGLKPEQV 
HQYYSCIPEEKWYWSPGEKLRIKQLLHQ 
LPPHDNEVRYCNSLDEEEKRELKLFS SQRK 
RENLGRGNVRPFP VTMTGAICEQVSMD SG 
Y 


2819 


A 


236 


559 


MWLEPMQMGFLHMMEKMAARTS AILD *G 
TIX*FHFTLTTSLKALSSHTPIFPGTGELQLP 
VSPSV(^QGMQIJa > STSSHLLKTVKPRM 
KRQSLLHMKQSFEPKIYL 


2820 


C 


209 


592 


^fETETKESGKNKKIPPKHQIENVGVGGLG 
AQD GLNQIGKIPP VLS CS Q SRFGTMP AAFP 
CVFPPQSLQVSPQMSSKAWEKQSLPLPGLR 
GSPX^RKNRNYDLCI^YCIJCNIFNCRGKPV 
LFWRICANR 


2821 


A 


381 


55 


PASLPPCSLISDCCASNQRDSVGVGPSEPGV 
GYSLVVRRFLSRSEKRNIRVGVTRFSRCV/L 
SPLSLTQKGNSLTPCASQVRQCLALLRLAH 
GACTHWPAPTVWHSLVR 


2822 


C 


2 


166 


MQKRHNCKKVHALPPAVLGFQRASGCRF 
ANKRSRITHFGGRRLSLTPASDSAGV 


2823 


A 


164 


423 


RGPVSRNQPPFTRFPQTRKTTETHVRGQSL 
PRPGTQSLQTKAAQVPSPQRLPKNPE*AV 
WLTOAPNAHPN* VARETPNCQTKS STR 


2824 


A 


792 


389 


PTRPPL\QLQAPRAHLSEDQKRLLLMKQKG 

VMNQPMAYAALPSHGQEQHPVGLPRTTG 

PMQSSVPPGSGGMVSGASPAGPGFLGSQP 

QAAIMKQMLIDQRAQLEEQQKQQFLREQR 

QQQQQQQQILAEQVTCPLA 


2825 


B 


1279 


1479 


MVPLCQVRVAGVRAGLALVSRTSPLAPNL 
AGVLGSGAPPPPPPGPSCLRALLRLPQQKS 
GPLRELLSAHGSKDGLWKAPTHFYDHLF 
PRLFVLMKLKF 


2826 


A 


1 


412 


MKALLALPLLLLLSTPPCAPQVSGIRGDAL 

ERFCLQQPLDCDDIYAQGYQSDGVYTJYPS 

GPSVPVPVFCDMTTEGGKWTVFQKRFNGS 

VSFFRGWNDYKLGFGRADGEYWLGLQNM 

HLLTLKQKYELRVDLEDFEN 


2827 


A 


3 


711 


KIADFGFSNLFTPGQLLKTWCGSPPYAAPE 

LFEGKEYDGPKVDIWSLGWLYVLVCGAL 

PFDGSTLQNLRARVLSGKFRIPFFMSTECE 

HURHMLVLDPNKRI^MEQICKHKWMKL 

GDADPNFDRLIAECXJQLKEERQVDPLNED 

VLLAMEDMGLDKEQTLQSLRSDAYDHYS 

AIYSLLCDRHKRHKTLRLGALPSMPRALGL 

SSTSQYPXAEQAGTAMNTSVPQVQLINPENQ 

IV 


2828 


A 


1350 


2203 


TWRLDPQHS SPKPQPGGTYTLE WKSSKSK 

KVLSPHP*WPPLRLWQR\GGSPEGGTQAPD 

GSLPPPPPRPKSERVGSPKLSGGKR/EGSHP 

GGPPHITHP/DGEEKAKSSWFGLREAKDPT 

QKPSPHPVKPLSAAPVEGSPDRKQSRSSLSI 

ALSSGLEKLKTVTSGSIQPVTOAPQAGQM 
VDTKRLKDSAVLDQSAKYYHLTHDELISL 
LLQRERELSQRDEHVQELESYEDRLLVRIM 
ETSPTLLOIPPGPPK 


2829 


A 


2 


259 


WQGGILGSDPTPPLTSPNLLQTACFREERD 

V/RRERGQPLGDHSALCLPRRGVPVPCDGL 

LCWWGPPDAAEPLRGPSPARAGPVLPG 


2830 


A 


1 


1062 


MTADAVLIKNGSKDADWEYEEGDKLEEFL 

RSLNSSKPLYLGQTGLGNIEELGKLGLEPG 

ENFCMGGPGMIFSREVLRRMVPHIGECLRE 

MYTITODVEVGRCVRRFGGTQCVWSYEG 

RCSFRVWDSAffiFSMDFEKILMLDPTLHPL 

CQNLLQRLNTMWKPPNVGLVPSKATAQA 

VRWSLLAMARAGAATMPGALSQGCIEVS 

RLIXKLPDDEGITMDTVGFAPLCLWQRLT 

LANHQRYFADGPQPVCNHMQPAPHHFAS 

MRS S AA SPTSLP AF AD P AA VPPLEH VY VW 

TLLLCQRWCITMYMDSTATTLTKHCCCPP 

PIPPIGVLLPADWGHIGPSSDSRSENKAMGS 

SPST 


2831 


A 


2 


238 


TKLNPKIMDVGWPELHAPPLDKMCTICKA 
QESWLNSNLQHVWIHCRGGKGRIGWISS 
YMHFTNVSAR*DEDVSSLS 


2832 


A 


3 


162 


RLHTANLGD SGFL VVRGGEVVHRSDEQQH 
YFNTPFOLSIAPPEAEGWLSDR 


2833 


A 


1 


988 


MPAEFFQRCSVTMVQLPWKEAHVERPHGE 

RD YTPDLQPDMWEKFP GLRRALRP WKTL 

LVQLEYRQAEKCEKRDWPSLPDYIFLLCW 

MLPALEYRTPSSSVLELRLALRAPQPADSL 

LWDLVIVPITSLKSWQTPRGEVEGVTHEEI 

CASLKSLAVALLSMSDLTVGTPVTQPQTL 

NTMGIIGSRGGRGQVAALNRQRQVPELIIGI 

DILSSWQNPfflGSLNGRGYINSLALCHNLIR 

RDUDRFLIJQDITLVHYIDHIMRLDSVKDK 

WLHLAPPTTKKEAQCLVGL/FGFWRQffiSH 

LETAL/RP VTGLWWKLNI *LW AIKSPCNLN 

CLS 


2834 


A 


4061 


2827 


EAGPAPLSAAAPGAGRGWPRPLAERRKGR 

GRRQPLRARLNRRRWAAGQGSTVQAATF 

GPAMAAAPLKVCTVGSGNWGSAVAKIIGN 

NVKKLQKFASTVKMWVFEETVNGRKLTDI 

INNDHEK^KYLPGHKLPENVVAMSNLSEA 

VQDADLLVFVTPHQFTHR1CDEITGRVPKKA 

LGITLIKGIDEGPEGLKLISDIIREKMGIDISV 

LMGANIANEVAAEKFCETT1GSKVMENGL 

LrivJb.Ll,v<) I FINrKll v \\JDt\iJ i v cjlv^aj/vl-jvin 

IVAVGAGFCDGLRCGD^riXAAVIRLGLNlE 

MIAF ARIFCKGQVSTATFLES CGV ADL1TTC 

YGGRNRRVAEAFARTGKTIEELEKEMLNG 

QKLQGPQTSAEVYRILKQKGLLDKFPLFTA 

VYOICYESRPVOEMLSCLOSHPEHT 


2835 


A 


106 


1814 


QLLPTDTPTGNSSPSLPHLPFAGACGLSIYN 
LWTQQKRNPSGSSGFILSWCFTNYSPVPPS 

LQMFFRLQLPPVNSEETSHYEIPLPGRRVEL 

RYPLRQGTEATDGQVCGNEDMLERDRVRK 

TRGSAPPPAHNLAPTEVALEDVLRIFTSAW 

RGVDGALEKGGTSCPARAQLPAEPEDPLF 

RCLRVSRLKDREVRGLGLPRQLQGVWSTT 

YPRRHAIAEHAGSPKPLRKREPETWQANK 

KGVIGIQLVVTMVMASVMQKIIPHYSLAR 

WLLCNGRKYNGfflESKPLTIPKDIDLHLET 

KS\nTEVDTIAXHYFPEYQWLVDFTVAATV 

VYLVTEWYNFMKPTQEMNISLVCKVLFS 

LTTHWKVEDGGERSVCVTFGFFFFVXAM 

AVLIVTENYLEFGLETGFTNFSDSAMQFLE 

KQGLESQTLIJnNFlJ^LFMVLLWVKPITK 

DYIMNPPLGKESIPLMTEA1TDTLRLWLIIL 

LCALRIJVMMRSHLQAYLNLAQKCVDQM 

KKEAGRISTVELQKMVARVFYYLCV1ALQ 

YVAPLVMLLHTTLLLKTLGNHSWGYLSRI 

YLYLTSG 


2836 


A 


2 


774 


HSYSHSHGHCGSPAGDTEQGYKPVWPVCS 

LFPDGSHPGV*QPIHEPA/QGRGGLPPWGA 

A*TPRAWRLA*RPRG*AALPWA*TSPGRPA 

SAPLAHTGSGCPSRPTRAPGPSP/IPIQNIKR 

PYPGEAFVPSRAGVPTVGVTRSFHLAPSLPP 

FPSS*LSPSLPPRTTTSCTRAILTPSS*QKLLY 

PPSRPWVLLVRRARPPAAAPTSEEPPERSP 

WETPHAAPSQLHELHETHSVAQKSDLLPA 

PEAM*PGSVSSRFLLY 


2837 


A 


2 


521 


CSAAWAPKLQLLSVCRQQLPGNPRARSHS 
HHRRTRARCPSGCGQARHSAGSWHKLQFP 
LCPWKMRSPLKMRSLOCMPSESRMVVTF 
LISALESTEQYHGGVYTPCDIDSNIILSPPDI 
SNNITEGVYTPCDIDRHLIPFFLPLDMRLQV 
LMPLDSGTCTSGFPEALRPSASD 


2838 


A 


14 


1256 


WPCGAAPGLTHASERMFTLl lMIQALAPV 

MGWDRKPIJCMFSSEEMRGHLHHHHKCLT 

KILKVEGQWDIJPSCIJPLTDNTRMI^SILIN 

MLYDDLRCDPERDHFRKICEEYITGKFDPQ 

DMDKNLNAIQTVSGILQGPFDLGNQLLGL 

KGVMEMMVALCGSERETDQLVAVEALIH 

ASTKl^RATFnTNGVSLIXQIYKTTKNEKI 

KIRTLVGLCKLGSAGGTDYGLRQFAEGSTE 

KIJVKQCRKWIX^SJMSIDTRTRRWAVEGI^ 

YLTLDADVKDDFVQDVPALQAMFELAKT 

QljyCFSKQHWEEHPKDKKDFIDMRVKRL 
IX^GVISAIACMVKADSAILTDQTKELLA 
RWLALCDNPKDRGTIVAQGGGKALIPLAL 
EGTD 


2839 


A 


1913 


1582 


EDSGLRLLWICLS LS LSFP *NRVSLCHPG WS 
AVARPOLTAARPSRLQQSSHLSLQSTWDH 
RHTPPYIALFFIFLFLVDM\SFTTVPRPVLNS 
WAOAILPFRPLKVLGLLA 1 


2840 


A 


44 


376 


MYMLLQAFWLWQETLKTILLYKFrKPPAN 
TPVLGVNAQVCHSCLAALRIRKVNGHKRN 
FKAQPPNGKLPLVLGCLCLLTDLIHALGYD 
CRRDFPVSLEYAELVFLFWAY* 


2841 


A 


522 


693 


LDFFLVFLQQFLPRPSSSEI*MLPGFPAAAY 
GPVAAAAVAAARGSGRKVYGTGDSQA 


2842 


A 


87 


439 


KTWTPQPRHPPPHPETSKPTPPC*GPVLCSC 
IJCVMPRPLPP/PP*DLCSPPLLAPGPRJRSAG 
GCWACQRRKKMSCLGGAGVCLKQGHGH 
MGLCYDLGLSTLAEPPGSSARRLPARSAL 


2843 


A 


1 


409 


MAETAVINHKKRKNSPRIVQSNDLTEAAY 
SLSRDQKRMLYLFVDQIRKSDGTLQEHDGI 
CEIHVAKYAEIFGLTSAEASKDIRQALKSFA 
GKE WFYRPEKD AGDEKG YESFP\WFIKH S 
TNTTSLSLWFFSSCTH 


2844 


A 


1 


894 


MPGPMSLWLLLLVLPLSLEHSDLRICFPGQ 

WSMESSSTGFIWTDVRAWQTSNRHVSSW 

REPRHS RMPP G AGLMERIQ AIAQNVSDIA V 

K\ODQILI^LLLHSKVSEGRRDQCEAPSDP 

KFPDCSGKVEWMRARWTSDPCYAFFGVD 

GTECSFLIYLSEVEWFCPPLPWRNQTAAQR 

APKPLPKVQAVFRSNLSHLLDLMGSGKES 

LIFMKKRTKRLTAQWALAAQRLAQKLGA 

TQRDQKQILVHIGFLTEESGDVFSPRVLKG 

GPLGEMVQWADILTALYVLGHGLRVTVSL 

KELQR 


2845 


A 


2 


1841 


'IWKNHMTTSVDGEKAFDKIQQPFMLKTL 

NKLVLEVLARAIRQEKGIKGIQLGKEEVKL 

SLFADDMIVYLENPIVSAQNLLKLISNFNK 

VSGYKINVQKSQAFVYTNNRQTESQIMSEL 

PFTIASKRIKYLGIQLTRDVKDFFKENYKPL 

LNEIKEDTNKWKKPCSWVGRINIVKN1AIL 

PKVTYRFN AIP IKLP MTFFTKLEKTTLKFI W 

NQKJRAH1AKTILSQKNKAGSIALPDFKLYW 

KATVTKTA WYWYQNRDID Q WNRIEPSEEEP 

fflYNHLIFDKPDKNKKWGKDSLFNKWCW 

ENWLAICRKLKDDPFLTPYTKINSRWIKDL 

NVRPKTIKTLEENLGNTIQAMGMGKDFMT 

ETPKAMATKAKIDKWDLIKLKSFCTAKET 

TIRVNRQPTEWEKIFTIYPSDKGUSRIYNEL 

KQINKKFCSNNPINKWAKD 

AANRHMKKCSSSLAIREMQIKTTMRYHLT 

PVRMAIIKXSGNNRCWRACGEIGTVGYKJSI 

DRQETQRTRJmiNILEDKPYGEDSrQIFLQV 

GQRKNGYARPQKSCIJCNIFQYWQKKMK 

EKTKKEKKWNLGNTR1KPEKGKENMGGT 

VLPPSSPIIWVEYEPPVSSP 


2846 


A 


60 


493 


EAGKRESSRDKGARCVYTRHGLRASIPAP 
GLRSRRGEQGCSGIRPSCGKBLVCPGCRNQ 
ENPEGNRGKGAARFTRESASGRGESRSAR 
GSIERSGDMRTYWLHS VwvLGFFLbLr 
OGLPVRSVDFNRGTDNTTVROGDTAIL 


2847 


A 


395 


3 


GGQGVTPWPSSCLPGTGSPAPSPTRLLGPT 

PRDRAEAIVGPDSATCSQTEGAQEGORCLP 

PG/MELPAGDGAGRRVGQGGPEGQLGGQQ 

RGKGAGPQPPPQEQPGLAWVGDRLIHPRL 

CLPPTCGHRAGSPGW 


2848 


A 


514 


738 


MNSLSWGAANAVLLLLLLAWASPTFISENR 
GWVMKGPSAFI^GDDMKr Air KJblsJJAt^ 
IRESSTRXXRSGSAGL 


2849 


A 


2 


427 


HVlKVlJmDWIFrPnQGP*SM/CSSKNESR 
HIGS*RVTG*LLEVLKSLL* SFGRLNALNM 
KSI/TSEVQEE*RKLNKTHRVQRDFDKDRK 
1JVVGQSESPGHPTSEKPPSTSSSAGCMLCS 
LfflSRGFQLRRKROLNGKCCPlO 


2850 


A 


3 


409 


RQEGEDSAGSWHSQGPGQCQGRAKAGSG 
P**/GPATGLGLGQ*QDQSQGKGQSSARPG 
♦GQAFQGQGQGRTRARSEAGKGQGQDRS 
RAGP*HGQGLR*GKGRARAR*GSGPRPG* 
GQGKKYGRTRGNAKAKAGPGLT 


2851 


A 


174 


446 


NTWLLP ALLLLCLSGCLS LKGPGS VTGTAG 
DSLTVWCQYESMYKGYNKYWCRGQYDT 
SCESIVETTGEEKGGKEWPRVHQRPPGGSR 

LHCDH 


2852 


A 


1008 


1246 


INNLSWQDYGESP*ALSNQTS*VVPILRPF1P 
VFLLLLFHL VFQFIQNRIQ AiTNH SI * QMFLL 
TTPQYHPLPODLPSA 


2853 


B 


428 


3792 


MSFDPNLLHNNGHNGYPNGTSAALRETGV 

IEKLLTSYGFIQCSERQARLFFHCSQYNGNL 

QDLKVGDDVEFEVSSDRRTGKPIAVKLVK1 

KQEELPEERMNGQVVCAVPHNLESKSPAA 

PGQSPTGSVCYERNGEVFYLTYTPEDVEG 

NVQLETGDK1KFVIDNNKHTGAVSARNIM 

LUCKXQARCQGWCAMKEAFGFIERGDV 

VKEIFFHYSEFKGDLETLQPGDDVEFTIKD 

RNGKEVATDVRLLPQGTV1FEDISIEHFEGT 

VTKVIPKVPSKNQNDPLPGRIKVDFVIPKEL 

PFGDKDTK5KVTLLEGDFIVRFNISTDRRDK 

I^RATNIEVLSNTFQFTNEAREMGVIAAMR 

DGFGnKCVDRDVRMFFHFSEILDGNQLHI 

ADEVEFTVWDMI^AQRNHAIRIKKLPKGT 

VSFHSHSDHRF LGTVEKEATFSNPKTTSPN 

KGKEKEAEDGIIAYDDCGVKLT1AFQAKD 

VFn^rrSPOIGDKVEFSISDKORPGQQVATC 

VRLLGRNSNSKRLLGYVATLKDNFGFIETA 

NHDKEEFFH YSEFSGD VD SLELGDMVEY SL 

SKGKGNKVSAEKVNKTHSVNGITEEADPTT 

YSGKX^IJRSVDPTQTEYQGMIEIVEEGD 

MKGEVYPFGIVGMANKGDCLQKGESVKF 

OLCVLGQNAOTMAYMTPLRRATVECVKI) 
QFGFTNTVEVGDSKKLFFHVKEVQDGIELQA 
GDEVEFSVTLNQRTGKCSACNVWRVCEGP 
KAV AAPI^DRLVKEOXNITLDD AS APRLM 
VLRQPRGPDNSMGFGAERKIRQAGVIDXN 
WRKQKCF VFTKING LFTQRSKP QTTRGKIK 
PPSPTSPELTLVTLDKAFSPLARDPVYGQFK 
KRAKKSDPSIPVI 


2854 


A 


1


747 


MRLQRPRQAPAGGRRAPRGGRGSPYRPDP 

GRGARRLRRFQKGGEGAPRADPPWAPLGT 

MALLALLLWALPRVWTDANLTARQRDP 

EDSQRTDEGDNRVWCHVCERENTFECQNP 

RRCKWTEPYCVTAAVKEFPRFFMVAKQCS 

AGCAAMERPKPEEKRFLLEEPMPFFYLKC 

CKIRYCNUGGAO^THQXCSKmiLGAWV 

RAVVGCGWPSSCCWPPLQPASACLEPRDC 

HRLSLPEHGLAPDRCHLLH 


2855 


A 


3 


1018 


FASFPSINLQQMLKEVPKRFGDERGATVHY 

TILNNHVYRRSLGKYTDFKMFSDEILLSLT 

RKVLLPDLEFYVNLGDWPLEHRKVNGTPS 

PIPnSWCGSLDSRDVVLPTYDITHSMLEAM 

RGVTNDLLSIQGNTGPSWINKTERAFFRGR 

DSREERLQLVQLSKENPQLLDA/WNYRIFL 

FPRERKGA\*KAKXMGLLDTCT*RNVDGTV 

AAYRYPYLMLGDSLVLKQDSPYYEHFYM 

ALEP WKJTYVPIKRNLSD LLEKVKWAKEN 

DEEAKKIAKEGQLMARDLLQPHRLYCYYY 

Q VLQKY AERQSSKPEVRDGMEL VPQPED S 

TAICQCHRKKPSREEL 


2856 


A 


3 


3707 


RAGEWPGWLLAAAAAHPGRPAASLSPGL 

GAVLGVAGRQVADPRFRRDWFRIPSPPAE 

SAGPARQAGFAAAPPARAGPALSTMKGTR 

AIGSVPERSPAGVDLSLTGLPPPVSRRPGSA 

ATTKPIVRSVSVVTGSEQKRKVLEATGPGG 

SQ AINNLRRSNSTTQVSQPRSG SPRPTEPTD 

FLMLFEGSPSGKKRPASLSTAPSEKGATWN 

VLDD QPRGFTLP SNARSS S ALD SP AGPRRK 

ECIVALAPNFTANNRSNKGAVGNCVTTM 

VHNRYTPSERAPPLKSSNQTAPSLNNIIBCAA 

TCEGSESSGFGKLPKNVSSATHSARNNTGG 

STGLPRRKEVTEEEAERFfflQVNQAAVTIQ 

RWYRHQVQRRGAGAARLEHLLQAKREEQ 

RQRSGEGTLLDLHQQKEAARRKAREEKAR 

QARRAAIQELQQKRALRAQKASTAERGPP 

ENPRETRVPGMRQPAQELSPTPGGTAHQA 

LKANNAGGGLPAAGPGDKCLr 1 bJJb^rJar 

QQPPEDRTQDVXLAQDAAGDNLEMMAPSR 

GSAKSRGPLEELLHTLQLLEKEPDALPRPR 

THHRGRYAWASEVTTEDDASSLTADNLEK 

FGKLSAFPEPPEDGTLLSEAKLQSIMSFLDE 

MEKSGQDQLDSQQEGWVPEAGPGPLELGS 

EVSTSVMRLKLEVEEKKQAMLLLQRALAQ 
QRDLTARRVKETEKALSRQLQRQKEA\YE 

ATIQRHLAEEDQLIEDKKVLSEKCEAWAE 

LKQEDQRCTERVAQAQAQHELEIKKLKEL 

MSATEKARREKWISEKTKKIKEVTVRGLEP 

EIQKLIARHKQEVRRLKSLHEAELLQSDER 

ASQRCLRQAEELREQLEREKEALGQQERE 

RARQRFQQHLEQEQWALQQQRQRLYSEV 

AEERERLGQQAARQRAELEELRQQLEESSS 

ALTRALRAEFEKGREEQERRHQMELNTLK 

QQLELERQAWEAGRTRKEEAWLLNREQE 

LREEIRKGRDKJE1ELVIHRLEADMALAKEE 

SEKAAESRIKRLRDKYE AELSELEQ SERKL 

QERCSELKGQLGEAEGENLRLQGLVRQKE 

RALEDAQAVNEQLSSERSNLAQVIRQEFED 

RVLAASEEETRQAKAELATLQARQQLELEE 

VHRRVKTALARKEEAVSSLRTQHKGSWK 

RADHLEELLKQHRRPTPSTKCPGMPGTLFK 

NGRQRTKAGRGPRGPQGRPPAPHRGWWL 

RCPRI^TCGCILTVKEAVWSKKKKKGAPF 


2857 


A 


1 


2064 


MTASIRRYHTCATDGEPDSSVLVGGDGDL 

TLLVAALGLDLGLPFMLLPPLMEWMRVAI 

TYAEHRRSLTVDSGDIRQAARLLLP/GPEH 

CFSSFR\RLDARAATEKFNQDLGFRMLNCG 

RTDLDSTQAIEALGPDGVNTMDDQGMTPLM 

YACAAGDEAMVQMLIDAGANLDIQVPSNS 

PRHPSIHPDSRHWTSLTFAVLHGfflSWQL 

LLDAGAHVEGSAVNGGEDSYAETPLQLAS 

AAGNYELVSLLLSRGADPLLSMLEAHGMG 

SSLHEDMNCFSHSAAHGHRGIWGLVTLGP 

LACLEEEDHETPSPRVPQSSPSGQEGTGGQ 

LRNVLRKLLTQPQQAKADVLSLEEILAEGV 

EESDASSQGSGSEGPVRLSRTRTKALQEAM 

YYS AEHGY VD ITMELRALGVP WKLHIWIE 

SLRTSFSQSRYSWQSLLRDFSSIREEEYNE 

ELWEGLQ]JvIFDILKTSKNDSVIQQLATIFT 

HCYGSSPIPSIPEIRKTLPARLDPHFLNNKE 

MSDVTFLVEGKLFYAHKVLLVTASNRFKT 

LMTNKSEQD GDSSKTIEISDMKYHIFQMM 

MQYLYYGGTESMEIPTTDILELLSAASLFQ 

LDALQRHCEILCSQTLSMESAVNTYKYAKI 

HNAPELALFCEGFFLKJHMKALLEQMMPSGS 

SSTAAAAKCRAWIHCRTCRTPWQSACTLS 

TSPPGSAA 


2858 


A 


1 


571 


FRPGRRAKRAMAVYVGMLRLGRLCAGSS 

GVLGARAALSRSWQEARLQGVRFLSSREV 

DRMVSTPIGGLSYVQGCTKKHLNSKTVGQ 

CLETTAQRVPEREALWLHEDVRLTFAQL 

KEEVDKAASGLLSIGLCKGDRLGMWGPNS 

YAWVLMQLATAQAGIILVSVNPAYQAME 

LEYVLKKVGCKALVFPKQ 


2859 


A 


2737 


2600 


MCCWIWFASILLRIFALMFIRDIGLKFSFFV 
VSLPGFGBRMMLAS* 


2860 


A 


1 


1353


MVKLSIVLTPQFLSHDQGQLTKELQQHVK 

SVTCPCEYLRKVSLLKTIFWSRNGHDGSTD 

VQQRAWRSNRRRQEGLRSICMHTKKRVSS 

FRGNKJGLKDVTTLRRHVETKVRAKIRKRK 

VTmNHHDKINGKRKTARKHTGDCHPGE 

WGQAHFVPDSPVHIALHGMAQPLFGIQG 

GALEPAGRGTGFLDSPVFRPIRKYNVQIPPS 

ARKALCNWSLLLVCVGKPEEFVAIHYYTPN 

TKLVPLARPRNSHWHPPERTTVTQYSTCA 

LLTALCLLLPVLQETAQSRRMVTSHPEDSP 

ALARKHGASQPAGLGFPRTQTVTPAFTFQT 

PTAAEPALLSAWLGRAPETETTTDMAGSA 

AAAPTCEMLRAHGHDDLYFKWEPCASSQ 

ATTVLPKHSGTGGSRQGPAVAHPAAPFPKV 

RGGEGTYYLHLSVFSDLVDLHLLHVGQRV 

VQGLRLRL 


2861 


A 


1553 


1896


CSSFCFPFPRSRPTAPRPDHRPAEPQRLHSA 
EGAPEWGPTSDPHHHPCPGGAPGGTQDP 
KMAAEAPQQPNSDWAGEISMCRGSTHQL 
QMAFSETFLSALSGSSRGRPAGKESC 


2862 


A 


262 


129 


SGLFLFFFPFPPFLPLPLCKHQIRDEWGNQI 
WICPGOSrKPDDGSPMIGCDDCDDWYHWP 
CVGIMTAPPEEMQWFCPKCANKKKDKKH 
KKRKHRAH *RDDYKMLFMTYKRKLRIFV 
RNALSLNT 


2863 


A 


3 


520 


LVDPRVRAVFLQLLPLLLSRAQGNPGASLD 

GRPGDRVNLSCGGVSHPIRWVWAPSFPAC 

KGLSKGRRPILWASSSGTPTVPPLQPFVGR 

LRSLDSGIRRLELLLSAGDSGTFFCKGRHE 

DESRTVLHVLGDRTYCKAPGPTHGSVYPQ 

LLIPLLGAGLVLGLGALGLVWWLH 


2864 


A 


1 


553 


RTRGRTRGLVIKKWASHHQINDASRGTLSS 

YSLV04VLiT^LQTLPEPILPSLQKTYPESFS 

PMQLHLVHQAPCMVPPYLSKNESNLGDLL 

LGFLKYYATEFDWNSQMISVREAKAIPRPD 

GIEWRNKYICVEEPFDGTNTARAVHEKQK 

FDMIKDQFIJKISWHRLKNKRDLNSILPVRA 

AVLKR 


2865 


A 


516 


848 


MWSLWrWVDQHQARLIPSPQVLLLLLRET 
PSTAAAVAGWLWASMALLQLHAVGGVA 
LTSSHPFMWATGEELRKPPWQGSAGSASG 
VEELTGKHSCPGPEEPATVQKAPA* 


2866 


A 


349 


1018 


TFTQPDPDDLISKPPRTPGGG*YQTQWPSPP 
DPRRTSPAGRPGPARRPPRRTPRPARGRHP 
GR* GGPGASRPGGTG AAPAADQTGSP AVS 
TPSEFGAPGQAEGPQSPIRASARSHLSCTA 
WLGKPSKPSAQRQPTVGPDGDRDGSSQAP 
NLSRGQAWRASLASPQNTSATGRVTCHGQ 
STWPLCRLKSNRRRKSGF A/ GNKSEP VGLT 
RRSKHQPRNPOGQVGI 


2867 


A 


117 


560 


MYWSLLL(XFFKKSDPDPGPFQNNLFHNH 

GTQSQSCMGSKVGDVTPGAARLISETAQRV 

HTIGQKQKNDQHLRRVQALLSGRQAKGLT 

SGRWFLRQGWL1WPTHGEPRPRMFFLFT 

DVLLMAKPRPTLHLLRSGTFACKALYPMA 

0 


2868 


A 


438 


2 


TQRLV1SEPDGEILTPGWDTQDRMGVESRT 
NIQELGNRNQREAGGENLPETQAHMGETQ 
DQLRCKIDAETQTPEWENQDKNGSEDAVE 
TQTFEKKDKKEAGEEDGEEIQAQGLGKQG 
OTGDENGEETQTRVLRALETIPASS 


2869 


B 


1 


390 


MTPKHDHLGHVLPISLQLLLELSSCLPAAS 
A VWCAG CNDPWMTG YPDNMH YNYKPML 
HDRGGSAVTLSASQSWYAGCNAEKSEVN 
AFPGTQGMRFISAASYKDWVQVLQQKDV 
SRNMGTKARSASSLKN 


2870 


A 


1 


3411 


MMEGEGGVRMSHDQTGNKRKHGTSGISV 

CPNLLLLQEYQPDYIRAHASGLNLISSSKAL 

PKYSHVLSGLCKICSFGPRFSLHSDTFFFAL 

FAHADPEQIRNCETPAPPLQTERKNEMRIK 

THPSSSPLYDTPGRPAGSDDSSSRGRAGAL 

STFLEPQRPRTHLSLELHRPSPGPRLSLPLFT

KPSFLGSGRREHAEERARGPRETAAVAAR

AEQGRGGSHSHSSALGAPRRVAMLPGLAL

LLLAAWTARALESLENRSAAGGCRKEMN

KGNDNGALAIGGNMVHWVDDFGWYVDR

DTLEQGSPTPSHGQVLVHGLLGTGPHSRST

LNIKEQLPRSKISSIGACNIIFQVDINAIFGIL

MVPTDGNAGLLAEPQIAMFCGRLNMHMN

VQNGKWDSDPSGTKTCIDTKEGILQYCQE

VYPELQITNVVEANQPVTIQNWCKRGRKQ

CKTHPHFVIPYRCLVGEFVSDALLVPDKCK

FLHQERMDVCETHLHWHTVAKETCSEKST

NLHDYGMLLPCGEDKFRGVEFVCCPLAEES

DNVDSADAEEDDSD VWWGGADTD YAD G

RTSAIFGYDHDCKVHDAFALSSVLVDRQE

WGSTYESGAGQGIAAFWGACWKEEQSLL

FLLPDMDWIX:LHSNINFNYISQNSHMLWR

DPGEIDSKKLSALSSLPGIVLALGKAQRILLI

ELLGVGLESEDKWEVAEEEEVAEVEEEE

ADDDEDDEDGDEVEEEAEEPYEEATERTT

SIA1T IT RITES VEEWREVCSEQAETGPCR

AMISRWYFDVTEGKCAPFFYGGCGGNRN

NFDTEEYCMAVCGSATNCTFDLKKSWSSG

GQIQMADSIQRKGAELEA1CQKRFSQRKHR

YGKCFV GVX.APVMEEHFV1GTLGAASPFM 

NKIXANLCYFTPENRALAVPTTAASTPDA 

VDKYLETPGDENEHAHFQKAKERLEAKHR 

ERMSQVMREWEEAERQAKNLPKADKKAV 

IQHFQEKVESLEQEAANERQQLVETHMAR 

VEAMLNDRRRLALENYITALQAVPPRVGL 
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Table 8 



SEQ 

ID 

NO: 


Method 


Predicted 
beginning 
nucleotide 
location of 
first amino 
acid residue 
of peptide 
sequence 


Predicted 
ending 
nucleotide 
location of 
last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (X=Unknown, *=Stop 
codon, /=possible nucleotide 
deletion, \=possible nucleotide insertion) 










AAAEFTLQVTAQTPRHWNMLKKYVRAE 
QKDRQHTLKHFEHVRMVDPKKAAQIRSQ 
VMTHLRVIYERMNQSLSLLYNVPAVAEEI
ODEVAFKINKNMNYYKPDAGKISG 


2871 


A 


18 


382 


GKMPPHLAMGCPPRLNPWEQPELGARGR 
GDGCPCPAEHGWALDVRYS*LPLPQSLASS
LAIPPQVFCSFTLSSFCSPRPAARQETPAGAP 
PAGPSFAGRRRnPGSGAPRRSPGGRRQEQ 
LR 


2872 


A 


673 


941 


CCLAAHSGPPAQGQRRGPG*LCCSAGSGG 
NL*S*AGGPG*GRSGQPVCPPWPGPGAPGH 
RPALPGSGGSSAVGRSAVPGAVRSPSHAG 
W 


2873 


A 


227 


712 


ALLESLSSGEAQAWGAPRLVAGERLIEHKC 

VLGGGTAGAWG*KDQVTIQPAGHAPGLSG 

TEATVTPDDSVSDPTTWPSQEVSMCHPLPG 

SHPSHLLKEGMTSVRPRALQQGPPWQLQT 

KDSAPPP*TPASFSPFFPLSPLPVSPSLSHTH 

SFRVQGAKRFA 


2874 


A 


1942 


932 


ARVRWRPPRWPPRASCPGPALRLCRGGSM 

GGPRGAGWVAAGLLLGAGACYCIYRLTR 

GRRRGDRELGIRSSKSAEDLTDGSYDDVL 

NAEQLQKLLYLLESTEDPVIIERALITLGNN 

AAFSVNQAIIRELGGPIVANKINHSNQSIKE 

KALNALNNLSVNVENQIKIK1YISQVCEDV 

FSGPLNSAVQLAGLTLLTNMTVTNDHQHM 

LHSOTDLFQVLLTGNGOT1CVQVLKLLLNL 

SENPAMTEGLLRAQVDSSFLSLYDSHVAK 

EILLRVLTLFQNIKNCXKIEGHLAVQPTFTE 

GSLFFLLHGEECAQKIRALVDHHDAEVKE 

KWTUPKI 


2875 


C 


1 


531 


MARNECVDGQPGHLVDFTCLVTYRVSGES 

RAPHPMAELFLVIYHMEEKLETHIPRKQER 

VEEKGPCICKALSPNSVNQRDAREKEMLQ 

QLQNRDTKQVLPSKASAHTPLDKAHHTAK 

PDGSGGEKDFLHTRTTPPPLLQGRAGNIFN 

NKTVYRSNTIITIGRWVLRAIELRPKDNN 


2876 


A 


1573 


2858 


EPVFEQAIDQRSSTDTSLSTPAAPMVDSLIA 

RVGVMARGNAITLPVCGRDVKJFTLEVLRG 

DSVEKTSRVWSGNERDQELLTEDALDDLIP 

SFLLTGQQTPAFGRRVSGVTEIAEK3SRRRK 

AAALTESDYRVLVGELDDEQMAALSRLG 

NDYRPTSAYERGQRYASRLQNEFAGNISA 

LADAENISQ*ICWKYFCAG*CGKYF\RKTIT 

RCINTAKLPKSWALFSHPGELSARSGDAL 

QKAFTDKEELLKQQASNLHEQKKAGVISP 

PEEVTTLLTSEIKTSSASRTSLSSRHQFAPGA 

TVLYKGDKMFITVKIAKRSQAPCMKSNNA 

UVILGWTLDAVGIGLVMPVLPGLLRDIVH 

SDSIASHYGVLLALYALMQFLCAPVLGALS 

DRFGRRPVLLASLLGATIDYAIMATTPVLW 

IYPLVNSPSC 


2877 


B 


448 


3506 


XALMIEIDGGESWSFMDDNQNKTHDKKE 

KKMVVQKPHGTMEYTAGNQDTLNSIALK 

FNITPNKLVELNKLFTHTIVPGQVLFVPDA 

NSPSSTLRLSSSSPGATVSPSSSDAEYDKLP 

DADLARKALKPIERVLSSTSEEDEPGWKF 

LKMNCRYFTDGKGWGGVMIVTPNNIMF 

DPHKSDPLVIENGCEEYGLICPMEEWSIAL 

YNDISHMKIKDALPSPGEWEDLASEKDINP 

FSKFKSINKEKRQQNGEKIMTSDSRPIVPLE 

KSTGHTPTKPSGSSVSEKLKKLDSSRETSH 

GSPTVTKLSKEPSDTSSAFESTAKENFLGED 

DDFVDLEELSSQTGGGMHKKDTLKECLSL 

DPEERKKAESQINNSAVEMQVQSALAFLG 

TENDVELKGALDLETCEKQDIMPEVDKQS 

GSPESRVENTLMHEDLDKVKUEYYLTKN 

KEGPQVSEMLQKTELSDGKSIEPGGEDITLS 

SSI^QAGDPITEGNKEPDKTWVKKGEPLPV 

KLNSSTEANVIKEALDSSLESTLDNSCQGA 

QMDNKSEVQLWLLKRIQVPIEDILPSKEEK 

SKTPPMFLCIKVGKPMRKSFATHTAAMVQ 

QYGKRRKQPEYWFAVPRERVDHLYTFFV 

QWSPDVYGKDAKEQGFVVVEKEELNMID 

NFFSEPTTKSWEIITVEEAKRRKSTCSYYED 

EDEEVLPVLRPHSALLENMHIEQLARRLPC 

KGYPWRLAYSTLEHGTSLKTLYRKSASLD 

SPVLLVIKDMDNQIFGAYATHPFKFSDHYY 

GTGETFLYTFSPHFKVFKWSGENSYFINGD 

ISSLELGGGGGRFGLWLDADLYHGRSNSC 

STFNNDILSKKEDFIVODLEVWAFD 1 


2878 


A 


226 


2263 


SVKNYTKCHVI^EQI(>JKLTS(^CSI^L 

NCQWDQRQQECQALPAHLCGEGWSHIGD 

ACLRVNSSRENYDNAKLYCYNLSGNLASL 

TTSKEVEFVLDEIQKYTQQKVSPWVGLRKI 

NISYWGWEDMSPFTNTTLQWLPGEPNDSG 

FCAYLERAAVAGLKANPCTSMANGLVCE 

KPWSPNQNARPCKKPCSLRTSCSNCTSNG 

MECMWCSSTKRCVDSNAYUSFPYGQCLE 

WQTATCSPQNCSGLRTCGQCLEQPGCGW 

CNDPSNTGRGHCIEGSSRGPMKLIGMHHN 

EMVLDTNLCPKEKNYEWSFIQCPACQCNG 

HSTCINNNVCEQCKhfLTTGKQCQDCMPGY 

YGDPTNGGQCTACTCSGHANICHLHTGKC

FCTTKGIKGDQCQLCDSENRYVG1SIPLRGT 

PEQSNKNLDISINASNNFNLNTTWSVGSTA 

GTISGEETSIVSKNNIKEYRDSFSYEKFNFR 

SNPMTFYVYVSNFSWPIKIQIAFSQHNTIM 

DLVQFF\nTFFSCFLSLLLVAAWWKIKQTC 

WASRRREQLLRERQQMASRPFASVDVALE 

VGAEQTEFLRGPLEGAPKPIAIEPCAGNRA 

AVLTVFLCLPRGSSGAPPPGQSGLAIASALI 
D1SQQKASDSKX)KTSGVRNRKHLSTRQGT 
CV 


2879 


A 


1 


1131 


MKVTFANKPEGGGRLAKQRPPGRGARPRP 

KHEGGQSVLGTRRPALLQVSCTDVSLSEQ 

DKDGATATHFAASRGHSKVLSWLLLHGG 

EISADLWGGTALYDAAENGELGCCQILW

NGAELEVRDRDGYAAADLSDFNGHSHCT

HCLRTVENLHRGMVLALGAAEHSKAQRP 

EAAGGPEGELPPEKESLEENEWPSRGQOLV 

PSAPTAVAQSMEHCVLSRDPSVELEAKQP 

DSGMSSPNTTVSVQPLNFDLSSPTSTLSNY 

DSCSSSHSSEKGQHPPRAPNPQ1LQYKKRFS 

ELEQLLERSGELEQQQLRDAEHSQDLESAL 

IWLEEEQQGGPGLAAWPPGRAPTDPLCPig 

ECQPGPGECHALRTAGPGRFGQPGSE 


2880 


A 


1 


416 


FRTDARVAITIYYQATEEFQNGIASYIPKDN 
SLQSETVQYKRGVCQQFCLPSHTVDPSEW
AEEELGFDLDREVYPLWHAWDEGDEYF
GHCHVLLGTFEKHTDGTFCVKPLKQKQVV
DGVSYLLQEIYGEENKYNTQ 


2881 


A 


419 


1 


KYFKCAPFPPATRPKAHTVFLKNVDIQVNL
RFCSKVAKLHYPNNLLFHSLGITKMQLDR 
KELAWQSHSGSKGRILFSPSLPALEQLRVP 
LEEHSASPDPIHPPSLAPERAASPGPPTGAE 
TRVPAPHAGTDPSEPPRR


2882 


A 


2 


366 


ARPRWLKRLGSQRELAQLGPEHLQAGHR 
PAPLRPAAGHAPDRVRAPQRRRASAHARG
SGGLVGPGALPLAAPSRPPGAPLRGDQGL 
GQLPASQPQGLGAHAAAADPGLQPRAAG 
ATEFSV 


2883 


A 


3 


1396 


RQENNTRGVPSLLKSFLQERLGIHLIRRKIV

KPKHHVLMSRKESWKVKSEIPKVPKQPLV 

LHHPRMTTTKSPSKDMLEPEAELAEDLPTT 

KSTSVES/EDAH*EPGRPFPVLPDUPCHCLP 

SAPTPLCIVKRPCPT*VTQLSASAQSAHQM 

RTPRAQSPSS*PR*VNCLPPS/LHKDDLELK 

EKDQKKPPTAPREVKGTRRKLPTAFLPSKY 

HGYEELLTAKPDPAFEEPKGIQKNA/PSPAT 

NAEAPTPVPLLQAQAGHSSETLCSQRETGP 

EWDSTPKED*SPTSG*HLHSLAObFbJtl Y Ku 

STRCCPAPVDRTAAGEP/ASSTWRPRGC*R 

SSRHVTGSW*VALCAQCSGLPRSPWPAQR 

*VRASPSSATSSSSWMSSARSPQPVTHKAR 

AVHGGCVHHPACAPALPEGSVPWTAPQG*

PAGHRPQSSAGPHLLATRWHPLVRISPPWP 

RHDLVPGPAAIKSGCTGQ 


2884 


A 


437 


748 


MUGLLAWLQTVPAHGCQFLPITSVTATVY 
HLPVHQLKGRSRVQKNLTLDNEGEGTWTT 
CLEFLESLAGWRLGWGVSRGVREWLCLQ 
QVSLHQTPGLPHKQDL* 


2885 


A 


1696 


2394 


ERSTYDLRSSDRPAQETSHQFQIHLPCVLLL 

YSPTLTLKYISTPSLATDHAPLTISLKPNHP 

YPAQCQYPIPQHALKGLKPAITRLLQHGLL 

KPINSPYNSPILPVLEPEKIYRLVQDLRLINQ 

IVLPIHPWPNPYTLLSSIPPSTIHYSVLDLK 

RAFFTIPLYPSSQPLFAFTWTOPDTLQAQQI 

TWAVLPQSFrDSPHYFSQAQISSLSVTYLbl 

IL1KTHTLSLLIMSD 


2886 


A 


377 


3 


TPAWMTERDCIWRRRTSAPGGSWPSGPVP 

SPGAQ*RPPSQGLGLWWAAAAAPRC*TAP 

GPRPPPHGPGSPQGASPPTRPPRCRPHPRA 

GSAGPTGATPPGSTQGQRRJRHSHQLPGHP 

GHRVALG 


2887 


A 


1162 


536 


HILRRQEFFFFCLFVCLRWVLVLLPRLE*CG 

MILAHCNLFLLGSSNSPASAS*VAGTTGVR 

HHAWIIFCILVETEFHRVAQTDLELLSSGNP

PASAS*SAGIIGVSHSAWPESCRYARRKCF 

CVKKLRRWKIJ^JPLCIQKAVSEGHCWQASP 

YRDSAVREQSIWGTTASSGGARMRWSSPA 

ALYVRLLAGFSFINKLVASEYRVFSSTL 


2888 


A 


128 


2626 


NSHRWVYVRARRWRRRGKQREQPEDRGV 

PMKRAAMALHSPQYIFGDFSPDEFNQFFVT 

PRSSVELPPYSGTVLCGTQAVDKLPDGQEY 

QRIEFGVDEVIEPSDTLPRTPSYSISSTLNPQ 

APEFILGCTASKJTPDGITKEASYGSEDCQYP 

GSALALDGSSNVEAEVLENDGVSGGLGQR 

ERKKKKKRPPGYYSYLKDGGDDSISTEAL 

VNGHANSAVPNSVSAEDAEFMGDMPPSVT 

PRTCNSPQNSTDSVSDIVPDSPFPGALGSDT 

RTAGQPEGGPGADFGQSCFPAEAGRDTLS 

RTAGAQPCVGTDTTENLGVANGQILESSG

EGTATNGVELHTTESIDLDPTKPESASPPAD 

GTGSASGTLPVSQPKSWASLFHDSKPSSSS 

PVAYVETKYSPPAISPLVSEKQVEVKEGLV 

PVSEDPVAJKIAELLENVTLIHKPVSLQPRG 

LINKGNWCYINATLQALVACTPMYHLMKF 

EPLYSKVQRPCTSTPMIDSFVRLMNEFTNM 

PVPPKPRQALGDBCIVRDIRPGAAFEPTYIYR 

LLTVNKSSLSEKGRQEDAEEYLGFILNGLH 

EEMU^LKKLI^PSNEKLTISNGPKNHSVNE 

EEQEEQGEGSEDEWEQVGPRNKTSVTRQA 

DFVQTPITGIFGGHIRSWYQQSSKESATLQ 

PFFTLQLDIQSDKIRTVQDALESLVARESVQ 

GYTTKTKQEVE1SRRVTLEKLPPVLVLHLK 

p F WVK TOTirOKLIKNIEYP VDLEISKELLS 

PGVKNKNFKCHRTYRLFAVVYHHGNSAT 

GGHYTTDWQIGI^GWLRJDDQTVKVINQ 

YQWKPTAERTAYLLYYRRVDLL 


2889 


A 


1669 


1338 


FRRPRRANRFRSRrRNQPGPHGETPFFL*IP 
KLARHGGG/CP*SPLLRRVRPENPFNPGSRG
FN*LKPQPCPPTWVTE*DSVSKTNKOPPPT 
KKNRDGRWGAIWESQMETWS


2890 


A 


807 


369 


GKGGGGQTRRCARPGRHHAAPALRADRT 
GPAPRRGLFGRCRTLQPSARRLSSEHSV*Q
THGCATPSRCHGGDGREDRGSPGDRGERP 
AGPAGGAGLEPAPGTLQPRSRPSRRWLLSP 
GAGAOOLEWHLPGQRPONOPCPLDFLP 


2891 


A 


1204 


2 


FFFPVPPPLFTDPRAPQPHRHLAFRGHRKE 

KGPGDPPSTPQSQ\ADPAAAPQGQPGC/RLP 

RGHCDRRHQEARPGCWGPPVGGPGSILGPK 

SWCHLEADSGKRPGWTVGVGVRSSPACP 

GH/VEQQGSAGSPGWMGWGCPCPVS*PLQ 

GQNQPSPSSLGGSRGSFFSPPDPA/GGQGQE 

GEGRGERSGQGPWGPGSFKNA/RQVAGGG 

QEGGQGPDPHDGGSLRPPRMKEGGLGRRG 

RPQPSVTPVLGSAARWSKAPPSQGQDHRT 

GGNRHLAP*SSGGRGGAPGALGL/PWHPA

CSGASGHSGRWA*RSSGWG*GPSPHTPPPG 

PARHPAPGLAGLAPHPARLRK*SGRSPR/E

AGVKISLLLGGERGI/PGPLAWHDSGDGG 

AGHRGGV*S*RS\PPDPLSLSPRPAA 


2892 


B 


74 


325 


SAFSYIPPRRLDPTEHSYYYRPAREQERPA 

GVLTSSVYGKRINQPIEPLNRDFGRANHVQ 

ADFYRKNDIPSLKEPGFGHIAPS 


2893 


A 


1 


3426 


MAGGQEVEAWADQLCAKYSKEYGKLCR 

TNQIGTVNDRLMHKLSVEAPPKILVERYLI 

EIAKNYNVPYEPDSWMVEDILEMSLVEFG 

NIGEAFLEQNQSPESSVTLTSANATLLLSRQ 

NISTLPLSSYTLGHPAPVRLGFPSALALKEL 

LNKHPGVNVQVFALDPVLGTFLELTSVILM 

VLWINLFVSAILMAFGKERKSLKWMQS 

NTICYRENRISTVPPSGTRETARKAKGHRG 

LPENPVQLSEAFNCQDKLCNWIPVGQCPA 

ARSTVYANERAQLPGTVTMASRVIFPLPLA 

FESLHTPGKSSSQGSDAGAGPPILGLFCPW 

TRGPRI^ALRARRLSSPIADVNKNIPPSKHR 

Tn^SRPDGSD^LPPFFVVTITPPARADVQE 

l^GHTIEQDEGERQHQIEKTEEENTNKPKR 

KQKLAPGTPQSNMKPVHERSQECLPPKKR 

DLPVTSEDMGRTTSCSTNHTPSSDASEWSR 

GVWAGQSQAGARVSLGGDGAEAITGLTV 

DQYGMLYKVAVPPATFSPTGLPSWNMSP 

LPPTFNVASSUQHPGIHYPPLHYAQLPSTS 

LQFIGSPYSLPYAVPPNFLPSPLLSPSANLAT 

SHLPHFVPYASLLAEGATPPPQAPSPAHSF 

\n<r ap«3 AT^P^noT PTTHSSTOPLDLAPGRMP 

rYYQMSRLPAGYTLHETPPAGASPVLTPQE 

SQSALEAAAANGGQRPRERNLVRRESEAL 

DSPNSKGEGQGLVPWECWDGQLFSGSQ 

TPRVEVAAPAHRGTPDTDLEVQRWASQV 

GPQSTILRTQCLCnSHLTFFASDFFVVPRTV 

NV^DTIKLFLRVTNNPVGVSLLASLLGFYV 

ITVVWARKKDQADMQKGCQTPAGVHPPA 

PQLEEAGTIPSGGLVKVTVLADNDPSAQFH 

YLIQVYTGYRRSAATTAKLSVYLILPGCRT 

RTRDPLSGVGSRPVAGAEYRLPGQFGRTST 

VAASNTQAEGAAGHRGFWLAKQHPKDAV 

TLELRCTPCRSIARLSDAGGVPAGARRVRC 

AAVLANCSLDMKRGVCASRSATVRKRSD 

KDVEELGDRESAVGVSDFLDGDAHYERN 

GNNSHLYQRHKKTKRGVAIARDKMPPDF 

QDHVIPGQEIKAKSFYSPVDSDETGDKIRY 

NSKRRHWRTGMLGL 


2894 


A 


3 


30 


ENFQHFMDRISNGGLEEGKPVDLVLSCVD 

NFEARMTINTACNELGQTWMESGVSENAV 

SGHIQLIIPGESACFACAPPLWAANIDEKT 

LKREGVCAASLPTTMGVVAGILVQNVLKF 

LLNFGTVSFYLGYNAMQDFFPTMSMKPNP 

QCDDRNCRKQQEEYKKKVAALPKQEVIQE 

EEEIIHEDNEWGIELVSEVSEEELKNFSGPV 

PDLPEGITVAYTIPKKQEDSVTELTVEDSGE 

SLEDLMAKMKNM*ISWEE 


2895 


A 


1 


2369 


AGGARLRPARGRPPRLLPPRPGPCRPPPVP 

APTVNERRAPPRAGWERRSDAGLSRGARP 

AEMYGVCGCYGALRPRYKRLVDNIFPEDP 

EDGLVKTNMEKLTFYALSAPEKLDRIGAY 

I^ERIJRDVGRHRYGYVCIAMEALDQLLM 

ACHCQSINLFVESFLKMVAKLLESEKPNLQ 

ILGTNSFVKFANIEEDTPSYHRSYDFFVSRF 

SEMCHSSHDDLEDCTKIRMSGIKGLQGWR 

KTVNDELQANIWDPQHMDKIVPSLLFNLQ 

HVEEAESRSPSPLQAPEKEKESPAELAERC 

LRELLGRAAFGNIKNAIKPVLIIILDNHSLW 

EPKVFAJRCFKIIMYSIQPQHSHLVIQQLLG 

HLDANSRSAATVRAGIVEVLSEAAV1AATG 

SVGPTVLEMFNVTLLRQLRLSIDYALTGSY 

DGAVSLGTKJ3KEHEERMFQEAVDCTVGSF 

ASTUTYQRSEVILFIMSKVPRPSLHQAVDT 

GRTGENKNRLTQIMLLKSLLQVSTGFQCN 

NMMSALPSNFLDRLLSTALMEDAEIRLFVL 

EILISFIDRHGNRHKFSTISTLSDISVLKLKV 

DKCSRQDTVFMKKHSQQLYRHIYLSCKEE 

TNVQKHYEALYGLLALISIELANEEVVVDL 

IRLVLAVQDVAQVNEENLPVYNRCALYAL 

GAAYLNIJSQLTTWAFCQHIHEVIETRKKE 

APYMLPEDWVERPRLSQNLDGVVIELLFR 

QSKISEVLGGSGYNSDRLCLPYIPQLTDED 

RLSKRRSIGETISLQVEVESRNSPEKEEVSV 

RATVLGQPHLL 


2896 


A 


1575 


1968 


REMGFRHVGQTGLELLTSGDLPTSASQSA 
GITGVSHHTWPKTLFVLRQSLTLSPGLECS 
GTISAHCSPHLPCSSNSCAPASRVAESTEAH 
H/LCPDNLHISSREGASPCWPGCS*TPELKR
PAHPCRDQLGH 


2897 


A 


524 


954 


FCSMSSQKWSWQAQPLSWRHWSQGPVPS 

LPAKLLFKGFLPGTAKPACSAFREAAALAF 

IQDNKTAISEEKGNGSRFLGFPSARLRGRPR 

AESPRPEPRARPRATQPGPAAPAAHATPPP 

GPAPAPYLVTRGASGGRGNVRGPK 


2898 


A 


188 


590 


DLHFEIQVLLEALRGLCSLYPKHREGSLKV 

HPGHIXTWMPTVTRPGTPPSQASTGAQELP 

GGEKKTCRWEKKKKTFPGSAGLTGKS1ER 

LTRPALYLRPLXFSSFPVRVTLEALPGGVPK 

RSASRMPVEMKRGPF 


2899 


A 


41 


274 


KRGTERKTHFGGCSIQFSDIASGKNILPGLC 

FLTHKR\WFCSL*RQGWVSRWSHE*GCTR 

CWRLGKFLWVADRFLGSG 


2900 


A 


1 


1462 


MKAMPWNWTCLLSHLLMVGMGSSTLLTR 

QPAPLSQKQRSFVTFRGEPAEGFNHLWDE 

RTGHIYLGAVNRIYKLSSDLKVLVTHETGP 

DEDWKCYPPRIVQTO^LTTTNNV>nCM 

LLIDYKENRLIACGSLYQGICKLLRLEDLFK 

LGEPYHKKEHYLSGVNESGSVFGVIVSYSN 

LDDKLFIATAVDGKPEYFPTISSRKLTKNSE 

ADGMFAYVFHDEFVASMIKIPSDTFTnPDF 

DIYYVYGFSSGNFVYFLTLQPEMVSPPGST 

TKEQVYTSKLVRLCKEDTAFNSYVEVPIGC 

ERSGVEYRLLQAAYLSKAGAVLGRTLGVH 

PDDDLLFTVFSKGQKRKMKSLDESALCIFI 

LKQINDRIKERLQSCYRGEGTLDLAWLKV 

KDIPCSSAJRVDGPRGNALQYETVQWDPG 

PVLRDMAFSKDHEQLYIMSERQSQELCPPQ 

ELDDIFSCCQTPRSPDFSHTGTHCALDEAA 

MAWEWSHSQ 


2901 


A 


14 


348 


GliTNKIPFSVLEIRTWAHLSGRHHSAHCT 
SCAWPQVACXPLATHPSCTCTFCSLQAPGR 
PGQSPLSPRRACGPEDLPPPPYV*DLAPSLG 
PSLGPLMSOSOPRRTPPLRG 


2902 


A 


191 


1375 


EWPEGGGRYSSVPSAVHHARTCLAAELSG

TSRPQEPRALPPETGVATAEAEKSNQPAAI 

SKXPNGQGAPLQR/RSPRLSPSPGAAQVPAL 

PMQDMSEGSSSPSPPGGHIWLASLTPCSLA 

LWNSCCQSPGSQPRGRDEGDCLVRATEPS 

ATGPDPRRTRLCSISASLVVRNTPDPGISDR 

RPGISDRRPGTSDRRPGTSDRRPGISDRRPG 

TSDRRPGTSDRRPGTSDRRPGTSDRRPGISD 

RRPGTSDRRPGISDRRPGTSDRRPGISDRRP 

GTSDRRPGTSDRRPGISRLPRDWIPAAAAS 

RENSNSADARNRCSSPSRKCQTPTSHRMR 

GSAGSVGSSAGHTAGGTGLPTPSRCSQAL 

QVFPAVLGKRGFLSWERSLKQRDIRGPDFS 

STAL1 


2903 


A 


1 


2547 


MRXYNSLVVDMRKVSVVWIDQASHNIPLS 
QSQIQIRPFNSVKAERGEEATEEELEANTAS 
GCASRLHSYLLALHCFTVRLCGVPSPHLFA 

SSTASLPESPGCCMHSLVTKSPCGDPLEPD 

DATLFKQNLFYLETLNTKQKLYHKKIFRTA 

MLFQFVNVLLQVLVHKSHDLLQEEIGIAIY 

NMASVDFDGFFAAFLPEFLTSCDGVDANQ 

KSVLGRNFKMDRVCEGISSLSRLQNELSYI 

EKFTDFLRLFVSVHLRR1ESYSQFPWEFLT 

LLFKYTFHQDLDIQPSQAVFGGIEFTYILVT 

LVTLGTQRVPKPGCGQGGRANCPNSGANA 

TANGTAAPAAAAAAATAYGERPTWRRAD 

TAGRPATNASASGFPHRIELKAGKTTTLED 

GRQINGADYLAAPVPGKALAEFGDTGPCD 

AALDLAKGVDVMVHEATLDITMEAKANS 

RGHSSTRQAATLAREAGVGKLIITHVSSRY 

DDKGCQHLLRECRDFKATRPNEKWVTDV 

TEFAVNGRKLYLSPVIDLFNNEVISYSLSER 

PVMNMVENMLDQAFKKLNPHEPIPVLHSD 

QGWQYRMRRYQNILKEHGIKQSMSRKGN 

CLDNAWECFFGTLKSECFYLDEFSNISEL 

KDAVTEYEBYYNSRRISLKLKALAVALANI 

DPEELTSCADACKRTALVANPWQLGNVR 

DARTYKELLDQIAELLRILGSADRLMEVIR 

EELELVREQFGDKRRTEITANSADINLEDLI 

TQEDVWTLSHQGYVKYQPLSEYEAQRRG 

GKGKSAARIKEEDFIDRLLVANTHDHILCF 

SSRGRVYSMKVYQLPEATRGARGRPIVNL 

LPLEQDERITAILPVTELGIL


2904 


A 


165 


638 


MFVIAFLSPLSLEFLAKFLKKADTRDSRQAC 

LAASLALALNGVFTNTIKLIVGRPRPDFFY 

RCFPDGLAHSDLMCTGDKDWNEGRKSFP 

SGHSSFAFAGLAFASFYLAGKLHCFTPQGR 

GKSWRFCAFLSPLLFAAVIALSRTCDYKHH 

WQGPFKW* 


2905 


A 


1 


2301 


MGWDCGLARWARVGLRERAAVQPLAPG 

CAAMSFAFPPFIPQGYKTAFGVGTNKIVTQ 

DNRWELPGAWYFPRASSQAREMPQCPTLE 

SQEGENSEEKGDSSKEDPKETVALAFVREN 

PGAQNGLQNAQQQGKKKRKKKRLGLKAG 

EWGAMLMIGDQSIQLPAFLSSIVRRAAQQ 

YGFTIEGGEDDDWTLYWTDYSVSLERVME 

MKSYQKINTiFPGMSEICRKDLLARNMSRM 

LKMFPKDFRFFPRTWCLPADWGDLQTYSR 

SRKNKTYICKPDSGCQGKGIFITRTVKEIKP 

GEDMICQLYISKPFnDGFKFDLRIYVLVTSC 

DPLRIF VYNEGLARr A 1 1 o Y sKrL l u nljjui 

CMHLTOTSINKHSSNFSRDAHSGSKRKLST 

FSAYLEDHSYNVEQIWRD1EDVHKTLISAH 

PHRHNYHTCFPNHTLNSACFEILGFDILLDH

KIJ^AVLLEVNHSPSFSTDSRLDKEVKDGL 

LYDTLVLINLESCDKKKVLEEERQRGQFLQ 

QCCSREMRIEEAKGFRAVOLKKTETYEKE 

NCGOFRIJYPSLNSEKYEKFFQDNNSLFQN 

TVASRAREEYARQLIQEIJ11XREKKPFQM 

KKKVEMQGESAGEQVRKKGMRGWQQKQ 

QQKDKAATQASKQYIQPLTLVSYTPDLLLS 

VRGERKNETDSSLNQEAPTEEASSVFPBCLT 

SAKPFSSn>DLRNlNLSSSKLEPSKPNFSIKE 

AKSASAVNVFTGTWSILEAEKSKIKVLAS 

LMSGEGLFLIDGSFLLCPHTVEGAS 


2906 


B 


1 


1518 


MVNTERQLDWIERCQVLILALSEEINPELPE 

AIVMASSEWTRQDNIDSPQEPPPTPLFASR

PVTRLKSWRAPRVRPVGPRTHPVVISPVPE 

OISIDILRSWQNPfflGTLTGRVRAVMVRKA 

KWKPLELSLPRKJVNQKQYCVPGGIVEISA 

TTIGDLKDAKVVIPTISLFNYPIWLVQKNDG 

SWRMAVDYHKLTQGVTPIAAAVPNVISLL 

EQINTSSGTWYAAIYLVNVFFS1PVHKALK 

KQFAFSWQGQPYTFTIIPWGHINSPTLCYN 

LIWRJELDHFSLPQDITLVHYIDDIMLIGSSE 

QEVANTLDLLEKALQQVQAAVQAALPLGP 

YDPADPWLEVSVADRDTVWSLCSCCYTP 

WFGTLSHVSNLQTWSPCPPPVSPVGSQRPQ 

l^REKNKNTKRIHSIPEVLIMKPYFTAVAKP 

SLLSHKWLPLEKPENPCCYSSDHRTAVPNL 

LLYRRSTRRKTELTNKELTSAHFTGDLPRR 

AVWVLGDRTAVRPSLEOGMALWI 


2907 


A 


2 


266 


KGSTEAFISGTAGWGTGLLPSSAGLPGGW 

GPAGGWAGTDRRGPRARPIPQKSPPWPWS 

GDAAKGQSGFLPVAAWAGQGRLPGGG1IV 

K 


2908 


B 


494 


641 


MADLEQLGLNPGLEGTHHLHHPGHMGAK 

LDKQHPHDRVPTRKSDPACGMGTAVAHH 

LAPGWLRAAVTQTPFKFCQWKLCSCVNIA 

GDSFSPWYGGISVAHPEPTVTASPTTQGSA 

LPPGEENPSEWLCAFSKREAQYEHSLRPL

KEDRTVYRVGPNKRGKRRTVLKHMQWKL 

IKGAYRRGQLLANNQAEHKWSRKINQDC 

FILEGGTAWKQHALSESSRHALAQFFT^MH 

LPAQPGALRAPLLLTLAALVFIVGVQSRGS 

RSRFLGCLEPIERSFLGVLPRSWERSVLCLP 

VNSLQGACLRLPAAADSSIFKRS 


2909 


A 


149 


300 


TRRGGCPEEKVEELKLWEKCVHSLYRHSS 
SALDLQKIPGAIYIPSGFPLR 


2910 


B 


312 


466 


MGQVWVLVHSTLEPFHTNNEEEAKYNEV 
TEEVTEQVCLPAKANAAKEKEVHPYPSAP 

t xnrurnT^ mirnTM3"DF»T QT7T PnTfJOnP^T T^H 
LNY vUcSs bVV rUrriJLt&S? LJ2U i \j\Juroi^i oxi 

WQLTKEAEAELQLIEKQVHKAQINRJDPEK 

IPDLLIFSTQHSPTGVIVQEQDLVEWFFLPH 

TDSWTLTPYLDQITTMIGIGRTR1VKLHGY 

DPGKEVPLMKAQIQQAFINSLTWQTHLAD 

FVGILDNHFPKMKLFQFLKLTNCELPKITKF

KPIEGAENVFTDGSSNGKASYFGSKRKVFQ 

TPYTSAQKVELVAVIELLTAFDMPINVISDS 

SYVVHSTQIJENAQLRFHTEEKLMTLFTQL 

QTAVRSRMHRFYITHIRAHTHLPGSLTEGN 

QMADRLVATAVSNARHFHSLTHVNASGL 

KHRYSITWKEAKAHQRCPTCQVVHSSSFT 

GGVNPRGLEPNSLWEMDVTHVPSFGRLAY 

VHACVDTFSHFVWATCQSGESSAYVKRHL 

LQCFWIGILASIKTDNAPGYTSQALATFFS 

IRNIKHITGrPYNSQGQAIVERMNLSPETAV 

AKSKKKGGKQGLRGHPICN 


2911 


A 


3 


415 


ETGRHRSQQSVSSPPVQPRGKRAMYHSAA 
ELVSRGFPRPPVQAPAEPAGAAEGVHSQPA 
SRQEA/GS/TEVRGQAHRFVSPPNAAGAGD 
G/PDPQSLLAPTNRPCPPGGISPARSEPVPPA 
PGRAAP*CFPDLPGLAPPLC


2912 


A 


178 


423 


MLL1PYFLEWKKLWPLAVLSLAWLTYDW 

OTHSQGGRRSAWVRNWTLWKYFRNYFPV 

KLVKTHDI^PKHNYEANHPHGILSF 


2913 


A 


52 


228 


MLTLPQSLWMLTRRTICFVPT1VSCRGLLPS 
NPHHELARLISVSQHRVWPHPVGTQYL* 


2914 


A 


447 


1331 


SHPLI^CTEKVSAKLRAAAEAAAEERRTR 

GAGSRGICAGLRSVAPGPEPLKQEEGRRE 

WGSSIGTPSPCGSAQAAAAAAAEEATEKIP 

ALRPALLWALLALWLCCATPAHALQCRD 

GYEPCVNEGMCVTYHNGTGYCKCPEGFL 

GEYCQHRDPCEKNRCQNGGTCVAQAMLG 

KATCRCASGFTGEDCQYSTSHPCFVSRPCL 

NGGTCHMLSRDTYECTCQVGFTGRNPKCP 

GGNLNYQFNGIIVVYSGGSVPPSGTKTSKP 

AEHNAMGTGSKNFASGTLWVMVSGATST 

STSTL 


2915 


A 


160 


409 


DSPTSVIWSSSTGKYSPHPSAGRWRGYCP 
RRVLCCPSPEAALEPGRARAQGIRGDSPW 
HGPTCTQPGRKTV1VGIQLPTQAI


2916 


A 


1578 


685 


VFLQQGLAQRTDLIGR1YQSWLAIMPGCNH 

SNITQLHMI^GLRIYHNKSAPVIEVYCPQKP 

ICKQNWTWLEIMKVFVWEDCIAKQAEVLC

NNSYGIHDWSPKGMFSLNCTCQSVCHSHT 

MFSWSEQNSQMVEMVRNTARVPTIWKRG

GIVAPQPQMIWSTVEAKHKDLWKLLMSV 

NKIKJWERIKKHLEGHSTNLFLDMAKLKEQ 

IFKASQAHLTLMPGTGVLKGAADKLAASN 

PLKWMKTLGSSVISMMIVLUCVVCLCVV 

CRCRS*LLREVAHRDKAAFAFIALQKQEG 


2917 


A 


118 


399 


KWKKYPLGFQTFSNNSQWDTSEFLCSSLL 
YVLGVSSQNAVNQYSIERSIVGGDCCPFFP 
WYVHHSWATLKEQRLFLAQQQQEDHEDC 
TKFEVPH 


2918 


A 


2 


335 


EDRSAFRPRQPHTLHPLHARSLAPRSPTPPS
PPSPDTQLGLSGPTSGPESAPTA/PGNPSWR 
SSRWGSSSPCAASST*KSPYP*/CSPT/CAFP 
SPRLPFCRSAYOPAAGAGRGK 


2919 


A 


486 


248 


VRQLFSLLLPRLECNGVISAHCNLRLPGSC 

DSSASAS*VARITGASGSQA\ r VLQVQCLQP 

VOPGELLRVDLFOLWLQR 


2920 


A 


3 


535 


AARQQHCTQVRSRRLMKELQDlAlO^iiJKl* l 
SVELVT)ESLFDWNVKLHQVDKDSVLWQD 
MKETNTEFILLNLTFPDNFPFSPPFMRVLSP 
RLENGYVLDGGAICMELLTPRGWSSAYTV 
EAVMRQFAASLVKGQGRICRKAGKSKKSF 
SRKEAEATFKSLWKTHEKYGWGHPARVP 

DG 


2921 


A 


3384 


1260 


AGQTPGHRASGPSERSPAPRSRLQPGGEAA 

TRTEPATTGRRAGPGSATMEALMARG\AL 

TGPLRALCLLGCLLSHAAAAPSPIIKFPGDV 

APKTDKELAVQYLNTFYGCPKESCNLFVL 

KDTLBCKMQKFFGLPQTGDLDQNTIETMRK 

PRCGNPDVANYNFFPRKPKWDKNQITYRIE 

GYTPDLDPETVDDAFARAFQVWSDVTPLR 

FSRIHDGEADIMINFGRWEHGDGYPFDGK 

DGLLAHAFAPGTGVGGDSHFDDDELWTL 

GEGQWRVKYGNADGEYCKFPFLFNGKE 

YNSCTDTGRSDGFLWCSTTYNFEKDGKYG 

FCPHEALFTMGGNAEGQPCKFPFRFQGTSY 

DSCTTEGRTDGYRWCGTTEDYDRDKKYG 

FCPETAMSTVGGNSEGAPCVFPFTFLGNKY 

ESCTSAGRSDGKMWCATTANYDDDRKW 

GFCPDQGYSLFLVAAHEFGHAMGLEHSQD 

PGALMAPIYTYTKNFRLSQDDIKGIQELYG 

ASPDroiXiTGPTPTLGPVTPEICKQDIVFDGI 

AQIRGEIFFFKDRFIWRTVTPRDKPMGPLL 

VATFWPELPEKIDAVYEAPQEEKAVFFAG 

NEYWIYSASTLERGYPKPLTSLGLPPDVQR 

VBAAFhWSKKOCTYIFAGDKFWRYNEVK 

KKMDPGFPKLIADAWNAIPDNLDAVVDLQ 

GGGHSYFFKGAYYLKLENQSLKSVKFGbl 

KSDWLGC 


2922 


A 


155 


575 


RRAQGEPERRAPSLAWTCRDPIPTREELAL

TSTTTSCISSLSIVPFQTILVGDSGVGKTSLL 

VQFDQGKFBPGSFSATVGIGFTNKVGTVDG 

VREKIPIWTPAGKERFRSVTHAYYRDAHG 

*FLLYDPNHRISLLRLSAL


2923 


C 


188 


207 


MWHLSV 


2924 


A 


3 


453 


VRSDMNSNPLNDGRYRAPPAPRAPAEAGAS 
SQP*SPPAAQASGKEGGENNAPLFQ*TPLPT
TPTDTLSVP\PRAPVPPSDRFLRSRPPGPRPS 
FPFRLQGGGGAPH*RGSSATPTPPA/SAPGP 
GVRSLPRPRWWTPIRLKKPWOKSADPSLQ 


2925 


A 


711 


4 


GARFACLCSTTPAPMASCLGLLILSSCLLA 
DCRFIPEAWSACTVTCGVGTQVRIVRCQV 
LLSFSQSVADLPIDECEGPKPASORACYAG 
PCSGEEPEFNPDETDGLFGGLQDFDELYDW 
EYEGFTKCSESCGGGVQEAWSCLNKQTR 
EPAEENLCVTSRRPPQLLKSCNLDPCPASSL 
WEPKCVGKGHQLFYLTTVLSSRKKQYRL
SMERLQRSLLGNQEAWLLILLSPTSSVA 


2926 


A 


2126 


2241 


RQGFHHVGQAGLKLLTSGDLPALASQSAG 
IAGMTHSAR 


2927 


A 


830 


1143 


NDQSALVRARSSFSECSVKPRTHQFFHMFNI 
GPARDGPPPPSPAPHGPGTLPYRGSSRPGSP 
PPPPRTPPVSSFLCHSSGAPVTRRDAAAQA 
HLLCSRFPFSFIG 


2928 


A 


1 


782 


MTKIQEPSTSVKFLGVQWSGAYQDIPSKV 

KI)KJLLHIAPPTTTKEAYLGL/FGFWRQHIP 

H/LGTEQEKTLQHVQAAVQVALFLEPYDP 

ADPMVLEVSVADRDAIWSLWQAPISESQW 

RPQGFWSKALPSSAANYSPFERQLLAYYW 

ALVETEHLTMGHQVTKQPEIJ>IM^ 

PSSHKVGCAQQHSIIKWKWYICDRARAGP 

EGTTTPVITQWAHEQSGHGGRDGGYTWA 

QQQGLPLTKADLATATAECPICQQQRPTLS 

P 


2929 


A 


1 


274 


MARATLSAAPSNPRLLRVALLLLLLVAAS 
RRAAGASWTELRCQCLQTLQGIHLKNIQS 
VNXATLKNGKKACLNPASPMVQKIIEKILN 
NP 


2930 


A 


1 


1236 


MLIGSSEQEVANTLDLFVRHLHAREWEIKL

TKIQGPSTSVKFLGVQWYGACQDIPSNVK 

DTLLHLAPPITKKEAQCLLGLFGFWRQHIP 

HLELPIKNWVLSDPSSYKVGCAQQYS11KW 

KWYICDWAQANPEGTINGLARWSGTWKK 

HNWKIGDKEIWGRGMWMDLSEWSKTVKI 

YVSHVSAHQQMTSAEEDFNNQVDRMTRS 

MDTTQPLSPTTPVTTQWAHEQSDHGGRDG 

DYTWAQQHGLPLTKSFTFAKEVWQWAHA 

HGIHWSYVPHHPEAAGLIERWNGLLKSQL 

KCQLGDNTLQGWGKVLQKAMYALNQHPI 

YGTVSPIARLHGSRNQGEEVEVAPLIITPGD 

LLAKFLLPVSTTLHSAGLGWYGFKLTRD 

GLVMVNTECQLDRIEGCKVLFLGVSVRVS 

PKEINI 


2931 


A 


3 


714 


RRPFIALCLSNVAFMLPWQFAQFILFTQIAS 

LFPMYVVGYIEPSKJQKIIYMNMISVTLSFI 

LMFGNSMYI^SYYSSSLLMTWAITLKRNEI 

QKLGVSKLNCWLIQGSAWWCGTIILKFLTS 

KILGVSDHtCLSDLIAAGILRYTDFDTLKYT 

CSPEFDFMEKATLLIYTKTLLLPWMVITCF 

IITCKTVGDISRVLATNVYLRKQLLEHSELA 

FHTLQLLAFTALAILILRLKLVL 


2932 


A 


1 


699 


MRFVMSVTMYHTTLVGLDIKHLNLESGKV 

WVMGKASKEPRLPIGRNAVAWIEHWLDL 

RDLFGSKDDALFLSKLGKRISARNVQKRFA 

EWGIKQGI^JNHVHPHKLRHSFATHMLESS 
GDLRGLFRFVSAKRHAKGSKVGSPIIYADQ 
IIIGAGQNHPARWRGLPRKSRLLVSPSNDK 
RRKAGAAPVAALRHFPPISIENArVKIQFRII 
RRLNHQQLVKPYPQVPISOATNQFR 


2933 


A 


1 


924 


MFAIISYSSLAAVLLTATLTAAGIISFPVALC 

LVIGANLGSGLLAMLNNSAANAAARRVAL 

GSLLFKLVGSLHLPFVHLLAETMGKLSLPK 

AELVIYFHVFYNLVRCLVMLPFVDPMARF 

CKTIIRDEPELDTQLRPKHLDVSALDTPTLA 

LANAARETCALATPWTDDGRKYAYSAAS 

GGRRSATKVMVWTDGESHDGSMLKAVI 

DQ(^FHDNILIO?GIAVLGYLNRNALDTKNLI 

KEIKAIASIPTERYFFNVSDEAALLEKAGTL 

GEQIFSIEDMDLGDEVYTVGRPHPMIDPTL 

RNQLIADLGAKPQVRVLLLDWIGFGATA 

DPAASLVSAWQKACAARIDNQPLYAIATV 

TGTERDPQCRSQQIATLEDAGIAWSSLPE 

ATLLAAALIHPLSPAAQQHTPSLLENVAV1 

NIGLRSFALELQSASKPWHYQWSPVAGQ 

GKWLANPELLEADADAEYAAVTDIDLADI 

KEPILCAPNDPDDARPLSAVQGEKIDEVFIG 

SCMTNIGHFRAAGKLLDAHKGQLPTRLWV 

APPTRMDAAQLTEEGYYSVFGKSGARVSSI 

PCAVPCVWARVADGATVVSTSTRNFPNRL 

GTGANVFLASAELAAVAALIGKLPTPEEYQ 

TYVAQVDKTAVDTYRYLNFNQLSQYTEK 

ADGLLKPRFRPWQRKILDTLATYHEQHRD 

EPGPGRERLRRMALPMEDEALVLLLIEKM 

RESGDIHSHHGWLHLPDHKAG*SSDNGKY

QRLFYLPAPRRSGTLPASAVCQSAPQQ/LA 

SSAEARKTFAPVPRRFGKLRVEVETTVAPS 

ATRAHTQGTAOGILDTRAPLLPKTL 


2934 


A 


201 


632 


MPGLLNWITGAALPLTASDVTSCVSGYAL 

GLTASLTYGNLEAQPFQGLFVYPLDECTTV 

IGFEAVIADRVVTVQDCJDKAKLESGHFDAS 

HVRSPTVTGNILQDGVSIAPHSCTPGKVTL 

DEDLERUFVANLWTIAPMYRAVWD 


2935 


A 


267 


25 


MGAVQRLMKIIMLNYRLVAHFLVLFAQK 

KANRQRTRVHRGSLWLSECESPNGPGGRH 

TEPAEGROARGRTPQQGFAVSLM* 


2936 


A 


34 


330 


MNKHFLFLFLLYCLIAAVTSLQCITCHLRT 
RTDRCRRGFGVCTAQKGEACMLLRIYQRN 
TLQISYMVCQKFCRDMTFDLRNRTYVHTC 
UN Y JN Y UN r IsJ- 


2937 


A 


34 


411 


MTAGTVVITGGILATVILLCIIAVLCYCRLQ 

YYCCKKSGTEVADEEEEREHDLPTHPRGP 

TCNACSSQALDGRGSLAPLTSEPCSQPCGV 

AASHCTTCSPYSSPFYIRTADMVPNGGGGE 

RLSFAP 


2938 


A 


333 


545 


MMPTNLAHLVFWQALLASGRFSLMEHYP
PNVQSNRGITHYMLPRGYILGLLYSSAGNT 
GTSRPRRTHYGT* 


2939 


A 


242 


382 


M^VMRGI^ITTTCIXSMLQArriSPSILW 
NHAAVQYVHGHSLVQA* 


2940 


A 


108 


290 


MPQWLALQRQQALLTLLSGAGTWAGMRP 
PSQCWPQGPSTGNQSLSHGRGELLTHAVG 
VCI* 


2941 


A 


109 


417 


MLMIJLVTGVSSLRNMIMCDYISRAKLKSS 
HIVLSYCIXKQEYDDSRGVMNLEAREEGS 
RGFYCLGCIDTGLQTPGGRGPSSALVTSVH 
LACEEYSKHSFVK* 


2942 


A 


155 


575 


RRAQGEPERRAPSLAWTCRDPIPTREELAL 
TSTTTSCISSLSIVPFQTILVGDSGVGKTSLL 
VQFDQGKPIPGSFSAWGIGFTNKVGTVDG 
VREKLPIXWTPAGKERFRSVTIiAYYRDAHG 
*FLLYDPNHRISLLRLSAL


2943 


A 


429 


1 


RLVYASTANKIHF*NDNNPGKNTDTVPHC 
HKLCNQDSHIRGNHRGQH1HSKTAKPCSG 
KTTFVIITFLI^DKHKYKLAPLRPAAASYSS 
PFTRKVTCLTRITEPS*P*HTAATLRSDQRS
QTCSHGTGTLSWRSSRWRSSSTK 


2944 


A 


1728 


2782 


RASSAVRGSLGDSARGRRRRSIVKVSLHPA 

VMSKSESPKEPEQLRKLFIGGLSFETTDESL 

RSHFEQWGTLTDCWMRDPNTKRSRGFGF 

VTYATVEEVDAAMNARPHKVDGRVVEPK 

RAVSREDSQRPGAHLTVKKIFVGGIKEDTE 

EHHLRDYFEQYGK1EVIEIMTDRGSGKKRG 

FAFVTFDDHDSVDKIVIQKYHTVNGHNCE 

VRKALSKQEMASASSSQRGRSGSGNFGGG 

RGGGFGGNDNFGRGGNFSGRGGFGGSRG 

GGGYGGSGDGYNGFGNDGSNFGGGGSYN

DFGNYNNQSSNFGPMKGGNFGGRSSGPYG 

GGGQYFAKPRNQGGYGGSSSSSSYGSGRR 

F 


2945 


A 


234 


657 


VQQPGRGLDLSTDGPGGRSQVGUWSCCC 
LH*AASGEPGGRCPGS/GAPGPAGSALEFR
ARDGVP\GVGGPSWESHSPAAATPPPAECR 
GPGPTPSPAPGEAAPEDREDGAAAPGRAEP 
ASIVAPADGSOGOVLATQAGALGA 


2946 


A 


1725 


2140 


YTYQISQTSGKL*PGDKSVHSELV/SSCNTSI 

ISSSGISSTSLL*LRRLFSAASANSASSVASK 

K*ASSMPLSQTASADAPVDSLLGDGL*GF 

WVSLLLVSSASSWNSSSSLKKNRRHTSAG 

NGKQSDLKFFALHTGS 


2947 


A 


1 


1134 


DTYCRGDQLHILLWRDHLGRRKQYGGDF 
LRARRSSPALMAGASGKVTDFNNGTYLVS 
FTLFWEGQVSLSLLLIHPSEGVSALWSARN 
QGYDRVIFTGQFVNGTSQVHSECGULNTN 
AELCQYU)NRDQESFYWVRPQHMRCAAL 
THMYSKNKKVSYLSKQEKSLFERSNVGVE 
IMEKFNTISVSKCNTLKSVDLHESGKXQHQ 
LAVDLDRNMQWQKYCYPLIGSMTYSVK 
EMEYLTRAIDRTGGEKNTVIVISLGQHFRP 
FPmVFIRRALNVHKAIQHLLLRSPDTMVn 
KTENIREMYNDAERFSDFHGYIQYLIIKDIF 
QDLSDIRHVLKYNASKNAADLDLFFSSNL 
DDFYNFSELHKGRSKSPLMQITQ 


2948 


A 


504 


198 


QLIQHQTVHTGRKLYECKECGKAFNQGST 
LJRHQRIHTGEKPYECKVCGKAFRVSSQLK
QHQRIHTGERPYQCKELKGRGAEMLAVLA 
VKEONRTPVNYGK 


2949 


A 


1 


578 


MGETALMIQLPPPGPALGTWGLWDLQFKT 

NTTSTDTDPRSHLQETGDNILTLFTMHPPL 

ESEWTICNFRQIWLLSSWSTLETRAQPLHS 

YFRKLKGRGTAIAGIVFGIVFIMGVIAGIAI 

CICMCMKNHRATRVGILRTTHTNTVSSYPG 

PPPYGHDHEMEYCADLPPPYSPTPQGPAQR 

SPPPPYPGNARK 


2950 


A 


1 


943 


AAAGRARGAGDMFRRKQSNPRQIKRSLGD 

MEAREEVQLVGASHMEQKATAPEAPSPPS 

ADVNSPPPLPSPTSPGGPKELEGQEPEPRPT 

EEEPGSPWSGPDELEPWQ/DGRRRIRARLS 

LATGLSWGPFHGSVQTRASSPRQAEPSPAL 

TLLLVDEACWLRTLPQALTEAEANTE1HRK 

DDALWCRVTKPVPAGGLLSVLLTGEPHST 

PGHPVKKEPAEPTCPAPAHDLQLLPQQAG 

MASIIATAVTNTOVFPCKDCGIWYRSERNL 

QAHLLYYCASRQGTGSPAAAATDEKPKET 

YPNERVCPYPOSRKSCPG 


2951 


A 


2 


435 


AVCRTSSDVDDNPPVFNQL1YESYVSELAP 
RGHFVTCVQASDADSSDFDRLEYSILSGND 
RTSFLMDSKSGVTTLSNHRKQRMEPLYSLN 
VSVSDGLFTSTAQVHIRVLGANLYSPAFSQ 
STYVAEVRENVAAGTKVIHVRATD 


2952 


A 


199 


399 


MPGSLCGRRTVCWLLGSVTSKQVLlhULR 
KFSRSSRLQEDQERSLGFRPFTHSPDMMW 
DLPAQDEWS 


2953 


A 


38 


397 


TVLCLTLTSCSFRQSLAT*SFGG/MGSGSVH

FGVGGAFLEPSIHWGS/GSRSLSVSSTHFVP 

SSSS/GGYGSGDASVLCRSDRLLTGTKITTQ 

NIHD/RLGSYLDKVRALEEAG\ELKVKICD 

WAP 


2954 


A 


2 


673 


NSRVEGQLCDLDPSAHFYGHCGEQLECRL

DTGGDLSRGEVPEPLCACRSQSPLCGSDGH 

TYSQICRLQEAARARPDANLTVAHPGPCES 

/^Dr^rv/CTTDVT^xWTsJVTr^nVTFnCEVFAYP 

MASIEWRKDGLJDIQLPGDDPHISVQFRGGP 

QRFEVTGWLQIQAVRPSDEGTYRCLARNA 

LGQVEAPASLTVLTPDQLNSTGIPQLRSLN 

LVPEEEAESEENDDYY 


2955 


A 


1 


440 


GNQKCTRNNHRISSLLCDPQEGYLQMLQIS 
NLYLYDSVLMLANAFHRKLEDRKWHNM 
ASLNCIRKSTKPWNGGRSMLDTIKKGHITG 
LTGVMEFREDSSNPYVQFEIIX3TTYSETL\E 
EPFVMVAENILGQPKRYKGFSIDVLDALA 


2956 


A 


23 


395 


GSGDAGGQHRARCPSGRAGNWDWHPPA 

MEEPGPPGGLSQDQVERCMGAMQEGMQ 

MVKLRGGSKGLVRFYYLDEHRSCIRWRPS 

RKNEKAEISIDSIQEVSEGRQSEVLQRYPDG 

SFDPNCCCSI 


2957 


A 


663 


144 


KELSAVSAGIPHSCGSQGCGGGSVAACVP 

AAPAAAGLCSGRAQKVPPPPSLAGWPPGV 

NAPPPPVCSSVRLHVCQSDRLWVRLAARR 

GILALLRSALKAATLAGCQSVRWSVRPSES 

LRPTSNAASLFRSSVPTVLSHSVPLAASLG 

KRRACGGREHASVAVYLSVCLSLPT 


2958 


A 


1856 


591 


PPTPTAETLTSEDAQPGSPLATGTDQVSLD 

KPLSSAAHLDDAAKMPSASSGEEADAGSL 

LPTTNELSQALAGADSLDSPPRPLERSVGQ 

I^SPPLTJTPPPKASSKTAKKMSQAKPHSSK 

PPA*RVPTI7PLRGQLSTPTGSPHLTTVHRP 

LPPSRVIEELHRALATKHRQDSFQGRESKG 

SPKKRLDVRLSRTSSVERGKEREEAWSFD 

GALENKRTAAKESEENKENL1INSELBCDDL 

LLYQDEEALNDSnSGTLPRKCKKELLAVK 

LRNRPSKQELEDRNIFPRRTDEERQEIRQQI 

EMKLSKRLSQRPAVEELERRMLKQRNDQ 

TEQEERREIKQRLlRJKlisrQRPTVDELRDRK 

LLIRFSDYVEVAKAQDYDRRADKPWTRLS 

AADKAAIRKELNEYKSNEMEVHASSKHLT 

RFHRP 


2959 


A 


1578 


685 


WLQQGLAQRTI1LIGRIYQSWLAIMPGCNH 

SMTQIJEIN1LSGLRIYHNKS AP VIE V YCPQKP 

ICKQNWTWLEIMNVFVWEDCIAKQAEVLC 

WJSYGIiroWSPKGMFSLNCTCQSVCHSHT 

MFSWSEQNSQMVEMVRNTARVPIIWKRG 

GIVAPQPQMIWSTVEAKHKDLWKLLMSV 

NKJKIWERJQCKmEGHSTNLFLDMAKIJCEQ 

IFKASQAHLTLMPGTGVLKGAADKLAASN 

PLKWMKTLGSSVISMMIVLLICWCLCVV 

CRCRS*LLREVAHRDKAAFAF1ALQKQEG 

GYAGE 


2960 


A 


470 


258 


MIIAIGGVWASGLVFIVLLMIRYKVYGDG 

DSRRVKGSRALPRVRHVCSQTNGAGTGAE 

QAPALPAQDHY* 


2961 


A 


3 


866 


ELNI^DFSHLDHRDLIPIIAALEYNQWFTK 

LSSKDIXLSTDVCEQILRVVSRSNRLEELV 

LENAGLRTDFAQKLASALAHNPNSGLHTI 

NLAGNPLEDRGVSSLSIQFAKLPKGLKHU 

LSKTHYYPKAVNSLSQSLSANPLTASTLVH 

LDLS GNVLRGDDI^HNnrNFLAQPN A1VHL 

DLSNTECSLDMVWGALLRGCLQYLAVLN 

LSRTVFSHRKGKEVPPSFKQFFSSSLALMHI 
Nl^GTKI^PEPLKALIXGIACNHNLKGVSL 
DLSNCELRSGGAOVLEGCIG 


2962 


A 


574 


203 


TQAFEQEVGNPLCIPSHCMGAVFILLNLAT 

AHSSGLCXLQLELSFRSLSTTAVHCCPRPTI 

DFHP/LGSSRVSAVLLIQ/QRCPLPLPIGLEA 

DHCSCMAKGPGFTLIELNTSHWVPQFSSVT 

HDFY 


2963 


A 


399 


15 


KTMVAHHIVENTYFCPVLATGLSGLYSSLP 

TKLEEKGEEWHCLLKDDWLLLPSLVQFM 

NSLEFCNAVTQVAHPLIRNQLVIYISNEFLV 

PVLAPALHKVPVQEVMSPTAYLDLFVRSIS 

EPALLEIF 


2964 


A 


3 


567 


CSEIFASLRLPRIMAHSKQPSHFQSLMLLQ 

WPI^YLAIFWILQPLFVYLLFTSLWPLPVL 

YFAWLFLDWKTPERGGRRSAWVRNWCV 

WTHIRDYWmLKTKI)l^PEHNYLMGVHP 

HGLLTFGAFCNFCTEATGFSKTFPGITPHLA 

TXSWFFKIPFVREYLMAKGASDHTYWSFW 

SMFLLGNAPF 


2965 


A 


2 


394 


TLADGGEGQFDGTFEPATVALPGGEHAEN 

AVQIHKWTGTMAUFSFLIAALVLYVSWK 

CFPASLRQLRQCFVTQRRKQKQKQTMHQ 

MAANlSAQEYYVDYKPNHIEGALVnNEYG 

SCTCHQQPARECEV 


2966 


A 


2 


412 


EFLSSNQITQLPNTTFRPMPNLRSVDLSYN 

KLQALAPDIJHGLRKLTTLHMRANAIQFV 

PVRIFQDCRSLKFLDIGYNQLKSLARNSFA 

GLFKLTELHLEHNDLVKVNFAHFPRL1SLH 

SLCLRRNKVAIWSSLDW 


2967 


A 


1 


1343 


ERCKVQSSTLVSSLEAELSEVK1QTHTVQQE 

NHLLKDELEKMKQLHRCPDLSDFQQKISS 

VI^YNEKLI^KEALSEELNSCVDKLAKSS 

LLEHRTATMKQEQKSWEHQSASLKSQLVA 

SQEKVQNIJBDWQISrVNLQMSRMKSDLRV 

TQQEKEALKQEVMSLHKQLQNAGGKSWA 

PEIATHPSGLHNQQKRI^WDKLDHm/NV 

EEQQLLWQENERLQTMVQNTKAELTHSRE 

KVRQLESNLLPKIIQKHLNPSGTMNPTEQE 

KLSLKRECDQFQKEQSPANRKVSQMNSLE 

QELETIHLENEG1JKJCKQVKLDEQLMEMQH 

LRSTATPSPSPHAWDLQLLQQQACPMVPR 

EQFLQLQRQLLQAERINQHLQEELENRTSE 

TNTPQGNQEQLVTVMEERMIEVEQKLKLV 

BCRLLQEKVNQLKEQVSLPGHLCSPTSHSSF 

NSSFTSLYCH 


2968 


A 


382 


203 


RPSSPGPPCPEAGKR/RFGCGGAGSLRPEHS 
\TRPPPRGLGKGRGQREKRGASKEGSEGCA 


2969 


A 


303 


46 


AVVFKLLSPRKKHLKNPFVGGVGCAWRT 
GWEWSPGQEQAPPPATGSMLATSSPPSGPP 
PPP*PPGFMLPPLGDGLGAGTSAGRS*EKG 
RGK 


2970 


A 


3 


586 


MVECPACQH*RPTLSLRDTSYHQVECIRSL 

LPWNGHQFVLTRIDICSK*/G/FVFPNYFASS 

STTI*ELTGCLffl*HT*N*GTH/LIAKEV*Q*T 

RSYKI/HWCI/PHHPEAASQIGFWNGLLKTG 

L/QLRLRCNALQS/WGAVLQNMVYALKCI 

GPKWIYSIVSPVG\HVHTGVASIT1TPSHSPV 

EFVPPRSEIWSQLGYDP 


2971 


A 


299 


21 


MGSSVLSIWILSPSIYPILSPLAMPCLSRTDL 
IRVRIUQGAWPSEGTASSIRGWVLTKLRMS 
SGKALEALYCIPGAAQHPGLGVTRVWSGR 
T* 


2972 


A 


1 


555 


KKVGNYYTTPIYRFRMKCHLCVNYIEMQT 

DPANCDYVIVSGAQRKEERWDMADNEQV 

LTTGERHPLTCLGAL/DPESALGPPKPSRAL 

rVAEHEKKQKLETDAMFRLEHGEADRSTL 

KKALPTLSHIQEAQSAWKDDFALNSMLRR 

RFRVRGAPARGQRGCMVDQGPGPALPPPH 

PSFEOATCTF 


2973 


A 


1 


598 


MAWIPAALGTAALVPWSILRGKAPRYWL 

LPLLLDPDKVPHSARDLTSPDAALASLTAQ 

SGGLEELHLKLVHEVAVMANTECQLDWIE 

GCKVLILACRLWDLVIMTHPAFYQSVQWG 

KGNDQTFQGRLDTGCELMLIPGDPNCGPP 

VKVGVYGGIIYHCDLTKEELEPRVFREVTV 

KGEDASDYOTVQLPKGTESSRN 


2974 


B 


1 


2142 


MGGAGSPQVILVSHTPQSASAACEEIAYQV 

AGVSGNLAPGNQPEKJEGRAHQCLECDRAF 

SSAAVLMHHSKEVHGRERIHGCPVCRKAF 

KRATHLKEHMQTHQAGPSLSSQKPRVFKC 

DTCEKAFAKPSQLERHSRIHTGERPFHCTL 

CEKAFNQKSALQVHMKKHTGERPYKCAY 

CVMGFTQKSNMKIJIMKRAHSYAVAVAM 

GGTAQCPPGATACLGTAICPSGLRAQRPSN 

LSVPEAAKPKSGRNRKEEAPTWALSTSKDP 

QTEGLRNPQTCVQIRSNPFCAFAQGFSLISE 

LRTLNCFVGLCDSQSGKQQLGFYSGQPAT 

EAWQKYSLAVCILRSEQEISATRLGLKNTN 

VNKLDGGCGAWNFLGGMSEHNSPPSGRA1 

LLPVWTE\O^GPWTPEQGSHICRMNLAPT 

FQAFLPKTGFPEDPQELLQGPIERTIWPGTV 

YTFRSAIVTARAVWVRPRMDRRADLSSAT 

QSASAEKFGGRVSAGHCALPLPARPVTAS 

VYGRLARLRGCLEDSYPSALSAQVFLDSPA 

VGCGLETRLFffiAALGPPCRATVTSRGHLL 

v^T/iTfnrmis^-n rtrf t^t fit r/""1T T T/*** firs /*\/"\Tf Z*^. A A 

DlSrTKSPGRPCFI^VCLHGSDQQJU<Js.<jAA 

ATAKRKSKGGGVNVEGRLCTWPPEDPPKS 

WSLAFGPLQEKTTELNLHPRCWARCLSHW 

ELPPGPRGRAQAPDWTGSKSFREQLLTFTL 

WGVQEKISKHQANQGKEAPAYTGLEDSDP 

GGLCAV* 


2975 


A 


248 


597 


DRCPAAWDRHPAGIQSSRREPSKATWTLR 
SKLSVQDGRRDSSLRLNCKVAARLGAGHP 
PMLRLGLRC*YPGKQGLEWTSSKLQQTCH 
*GS*LOCGKLTNRXDIHSKTPSVRHYHQR 


2976 


A 


2 


353 


EVDHRGDYVSHEIMHHQRRRRAVPVSEVE 
PLHLRLKGSRHDFHVDLRTSSSLVAPGFIV 
QTLGKTGTKSVQTLPPEDFCFYQGSLRSHR 
NSSVALSTCQGLSGMIRTEEADYFLRPL 


2977 


A 


134 


412 


MVKPIGPRVRRGLESPLCHACbnLALCTLAL 
VRLCALSRSRSLSLMOLQAFYRPPMSQEP 
ALSTVLFLLLLLANPPTKVSRSHRKERVLL 
LVA 


2978 


A 


1 


598 


MAFLETSAPLYEHIWTLQVAFSTVGLGETL 

KVAMISMSTSSGYFLQLLQYCCSSTinTGY 

KGFLRDLKVETRADGVMRTMAPEKLLKS 

MPILQGQIDALLEFDVHPNELTNGVINAAF 

MLLFKDLIKIJACYNDGVINLLGTWMKLE 

TIILSKLLQRQKTKHCMFSLIGGNRTMRTL 

GHRKGNITHWALLAGGGAAEG 


2979 


A 


793 


1 


GSRIDDMKSERRPPSPDVTVLSDNEQPSSPR 

VNGLTTVAIJCETSTEALMKSSPEERERMIK 

QLKEELRLEEAKLVLLKKLRQSQIQKEATA 

QKPTGSVGSTVTTPPPLVRGTQNIPAGKPS 

LQTSSARMPGSVTPPPLVRGGQQASSKLGP 

QASSQVVMPPLVRG\AQQfflSIRQHSSTGPP 

PLLLAPRASVPSVQIQGQRIIQQGL1RVANV 

PNTSLLVN1PQPTPASLKGTTATSAQANSTP 

TSVASWTSTESPASRQAA 


2980 


A 


2 


1427 


LLARGAGRTNPAPPLMSCGPWGKFLKCCE 

VYKSGPYKVQ*EEITIHSRAEAESTYQIKYE 

ELQTLAGKHGDDLRCAK/T/EISEMNQNISR 

LQAETEGLKGQGASLEAAIADAEQWGELA 

IKDAOTKLSELEAAMQRAKQDMA/RQLGE 

YQKLALDIEIATYRKLLEGEESRLESGMQN 

VSMKKTTSGYAGAPARIVSLLQNELLSLE 

VGVLKGHPTGKGEELGAPYSECSFGLCRR 

TVMLTQAPSSVVRSRNSRNHTVNSGGSCL 

SASTVAIPAINDSSAAMSACST1SAQKRTCC 

TACEPARKYKDTASHQEPAVCQPACQLET 

ADPKGGGVLALPQPPSPGMLCWPYCRAH 

ATDYFLANFFSEFPCHFLHRAGAAQTQAT 

GDGMEHGQSRELPKRKAPREESETSEEKSP 

NKWGPVSKQKKQLLVDILTTIIRPTRGNAY 

TGLSTRKWKPRSEENALMQPNKKDEKGTL 

TQKLGL 


2981 


A 


4235 


940 


ARGRRSRPVWAASWGGRGRPAARRRPRG 

LAATMGFELDRFDGDVDPDLKCALCHKV 

IJEDPLTTPCGHVFCAGCVLPWVVQEGSCP 

ARCRGRLSAKELNHVLPLKRLILKLDIKCA 

YATRGCGRWXLQQLPEHLERCDFAPARC 

RHAGCGQVLLRRDVEAHMRDACDARPVG 

RCQEGCGLPLTHGEQRAGGHCCARALRA 
HNGALQARLGALHKALICKEALRAGKREK 

SLVAQLAAAQLELQMTALRYQKKPTEYSA 

RLDSLSRCVAAPPGGKGEETKSLTLVLHRD 

SGSLGFNIIGGRPSVDNHDGSSSEGIFVSKIV 

DSGPAAKEGGLQIHDRIIEVNGRDLSRATH 

DQAVEAFKTAKEPIVVQVLRRTPRTKMFT 

PPSESQLVDTGTQTDJTFEHIMALTKMSSPS 

PPVLDPYLLPEEHPSAHEYYDPNDYIGDIH 

QEMDREELELEEVDLYRMNSQDKLGLTVC 

YRTDDEDDIGIYISEIDPNSIAAKDGRIREG 

DRHQINGIEVQNREEAVALLTSEENKNFSL 

IJARAELQLDEGWMDDDRNDFLDDLHMD 

NILEEQHHQAMQFTASVLQQKKHDEDGGT 

TDTATILSNQHEKDSGVGRTDESTRNDESS 

EQENNGDDATASSNPLAGQRKLTCSQDTL 

GSGDLPFSNKSHSPECTGAAYLGPVDECE 

RFRELLELKCQVKSATPYGLYYPSGPLDAG 

KSDPESVDKELELLNEELRSIELECLSIVRA 

HKMQQLKEQYRESWMLHNSGFRNYNTSI 

DVRRHELSDITELPEKSDKDSSSAYNTGES 

CRSTPLTLEISPDNSLRRAAEGISCPSSEGA 

VGTTEAYGPASKNLLSITEDPEVGTPTYSPS 

LKELDPNQPLESKERRASDGSRSPTPSQKL 

GSAYLPSYHHSPYKHAHIPAHAQHYQSYM 

QLIQQKSAVEYAQSQMSLVSMCKDLSSPT 

PSEPRMEWKVKIRSDGTRYITKRPVRDRLL 

RERALKIREERSGMTTDDDAVSEMKMGR 

YWSKEERKQHLVKAKEQRRRREFMMQSR 

LDCLKJEQQAADDRKEMNII^IJSHKFCMMK 

KRNKKIFDNWMTIQELLTHGTICSPDGTRV 

YNSFLSVTTV 


2982 


A 


792 


389 


PTRPPUQLQAPRAHLSEDQKRLLLMKQKG 

VMNQPMAYAALPSHGQEQHPVGLPRTTG 

PMQSSVPPGSGGMVSGASPAGPGFLGSQP 

QAAIMKQMLTDQRAQLIEQQKQQFLREQR 

OOOOOOQOILAEQVTCPLA 


2983 


A 


3 


268 


FTRSDELARHYRTHTGEKRFSCPLCPKQFS 
RSDHLTKHARRHPTYHPDMIEYRGRRRTP 
RIDPPLTSEVESSASGSGPGBAPSFTTCL 


2984 


A 


3 


431 


GPEFPGSAKLVFLDLSYNNLTQLGAGAFRS 
AGRLVKLSLANNNLVGVHEDAFETLESLQ 
\aELNDNNLRSLSVAALAALPALRSLRLD 
GNPWLCDCDFAHLFSWIQENASKJLPKGLD 
EIQCSLPMESRRISLRACRRPASRV 


2985 


A 


108 


497 


MGIYQMYLCFLLA VLLQ L Y V A' IE A1L1AL V 

GATPSYHWDLAELLPNQSHGNQSAGEDQ 

AFGDWLLTANGSEIHKHVHFSSSFTSIASE 

WFLIANRSYKVSAASSFFFSGVFVGVISFG 

QLSDRFGRKKVY 


2986 


A 


488 


754 


QSIYQEKFDDENF1LKHTGPGCLSMANAGP 
TQMVPSFSPVWPRLSGWMASTRSLAK*EE 
GVN1MEAMECSGSGNGETGKKIPTAXCGQ 
L 
Table 9 



SEQ ID 
NO: of full-length 
nucleotide 
sequence 


SEQ ID 
NO: of full-length 
peptide 
sequence 


SEQ ID 
NO: of 
contig 
nucleotide 
sequence 


SEQ ID 
NO: of 
contig 
peptide 
sequence 


Identification of 
Priority Application 
that contig nucleotide 
sequence was filed 
(Attorney Docket 
No._SEQ ID NO.) * 












1 


1042 








2 


1043 








3 


1044 








4 


1045 








5 


1046 


2083 


2535 


790 104 


6 


1047 








7 


1048 








8 


1049 








9 


1050 


2084 


2536 


790 16362 


10 


1051 








11 


1052 








12 


1053 








13 


1054 








14 


1055 








15 


1056 








16 


1057 








17 


1058 


2085 


2537 


784 5743 


18 


1059 


2086 


2538 


790 167 


19 


1060 








20 


1061 


2087 


2539 


788 2001 


21 


1062 








22 


1063 


2088 


2540 


784 1683 


23 


1064 


2089 


2541 


785 1699 


24 


1065 








25 


1066 








26 


1067 


2090 


2542 


789 5434 


27 


1068 








28 


1069 


2091 


2543 


790 13996 


29 


1070 








30 


1071 








31 


1072 








32 


1073 








33 


1074 


2092 


2544 


784 6213 


34 


1075 


2093 


2545 


784 1993 


35 


1076 








36 


1077 


2094 


2546 


790 3341 


37 


1078 


2095 


2547 


791 5740 


38 


1079 








39 


1080 


2096 


2548 


792 4643 


40 


1081 








41 


1082 








42 


1083 








43 


1084 


2097 


2549 


790 407 


44 


1085 








45 


1086 


2098 


2550 


785 1457 


46 


1087 


2099 


2551 


790 20129 


47 


1088 








48 


1089 


2100 


2552 


790 I 8963 


49 


1090 


2101 


2553 


790 515 


50 


1091 


2102 


2554 


787 7703 


51 


1092 








52 


1093 








53 


1094 


2103 


2555 


784 7239 


54 


1095 


2104 


2556 


790 19031 


55 


1096 


2105 


2557 


/91 1/jO 


56 


1097 








57 


1098 








58 


1099 








59 


1100 


2106 


2558 


>7nn oirtO/i 


60 


1101 








61 


1102 


2107 


2559 


/BO 3000 


62 


1103 








63 


1104 


2108 


2560 


787 2031 


64 


1105 








65 


1106 








66 


1107 


2109 


2561 


784 2939 


67 


1108 


2110 


2562 


787__4769 


68 


1109 


2111 


2563 


792 7097 


69 


1110 


2112 


2564 


788 9897 


70 


1111 


2113 


2565 


790 29652 


71 


1112 








72 


1113 


2114 


2566 


784 4530 


73 


1114 








74 


1115 








75 


1116 


2115 


2567 


787 7560 


76 


1117 








77 


1118 








78 


1119 








79 


1120 








80 


1121 








81 


1122 








82 


1123 








83 


1124 


2116 


2568 


784 1264 


84 


1125 


2117 


2569 


791 1515 


85 


1126 








86 


1127 


2118 


2570 


784 3498 


87 


1128 








88 


1129 








89 


1130 








90 


1131 








91 


1132 








92 


1133 








93 


1134 


2119 


2571 


791 1404 


94 


1135 










95 


1136 


2120 


2572 




96 


1137 








97 


1138 


2121 


2573 


787 7852 


98 


1139 








99 


1140 


2122 


2574 


788 5026 


100 


1141 








101 


1142 


2123 


2575 


790 16594 


102 


1143 


2124 


2576 


/yil yfj i 


103 


1144 








104 


1145 








105 


1146 








106 


1147 








107 


1148 


2125 


2577 


790 Holy 


108 


1149 


2126 


2578 


790 1040 


109 


1150 


2127 


2579 


78/ 94o 


110 


1151 








111 


1152 








112 


1153 








113 


1154 


2128 


2580 


790 lyooz 


114 


1155 








115 


1156 


2129 


2581 


788 12191 


116 


1157 


2130 


2582 


784 5727 


117 


1158 








118 


1159 


2131 


2583 


784 7669 


119 


1160 








120 


1161 


2132 


2584 


784 5053 


121 


1162 








122 


1163 








123 


1164 








124 


1165 


2133 


2585 


790 9619 


125 


1166 








126 


1167 








127 


1168 


2134 


2586 


790 1144 


128 


1169 








129 


1170 








130 


1171 








131 


1172 


2135 


2587 


790 16699 


132 


1173 


2136 


2588 


790 1170 


133 


1174 








134 


1175 


2137 


2589 


790 1171 


135 


1176 








136 


1177 








137 


1178 








138 


1179 








139 


1180 


2138 


2590 


785 66 


140 


1181 


2139 


2591 


790 11744 


141 


1182 








142 


1183 








143 


1184 


2140 


2592 


784 IVZZl j 


144 


1185 


2141 


2593 


790 1217 


145 


1186 


2142 


2594 




146 


1187 








147 


1188 








148 


1189 


2143 


2595 


784 3575 


149 


1190 








150 


1191 








151 


1192 








152 


1193 


2144 


2596 


787 9817 


153 


1194 








154 


1195 


2145 


2597 


784 9353 


155 


1196 








156 


1197 








157 


1198 








158 


1199 


2146 


2598 


784 4306 


159 


1200 








160 


1201 








161 


1202 








162 


1203 








163 


1204 


2147 


2599 


790 23831 


164 


1205 








165 


1206 








166 


1207 








167 


1208 


2148 


2600 


790 1363 


168 


1209 


2149 


2601 


784 1344 


169 


1210 








170 


1211 








171 


1212 


2150 


2602 


787 1542 


172 


1213 








173 


1214 


2151 


2603 


785 2871 


174 


1215 


2152 


2604 


787 5391 


175 


1216 


2153 


2605 


790 27456 


176 


1217 








177 


1218 


2154 


2606 


784 1229 


178 


1219 








179 


1220 


2155 


2607 


788 1187 


180 


1221 


2156 


2608 


784 256 


181 


1222 








182 


1223 








183 


1224 


2157 


2609 


790 6023 


184 


1225 








185 


1226 


2158 


2610 


790 28512 


186 


1227 








187 


1228 








188 


1229 








189 


1230 








190 


1231 








191 


1232 








192 


1233 


2159 


2611 


790 27560 


193 


1234 


2160 


2612 


784 9678 


194 


1235 








195 


1236 


2161 


2613 


787 2238 


196 


1237 








197 


1238 


2162 


2614 


7R7 Rfl1 1 


198 


1239 








199 


1240 


2163 


2615 


784 9436 


200 


1241 


2164 


2616 


787 6897 


201 


1242 








202 


1243 








203 


1244 


2165 


2617 


790 1649 


204 


1245 








205 


1246 


2166 


2618 


790 1664 


206 


1247 


2167 


2619 


790 1671 


207 


1248 


2168 


2620 


789 4182 


208 


1249 


2169 


2621 


787 3365 


209 


1250 


2170 


2622 


790 24699 


210 


1251 








211 


1252 


2171 


2623 


790 24002 


212 


1253 








213 


1254 


2172 


2624 


790 1713 


214 


1255 








215 


1256 


2173 


2625 


790 12005 


216 


1257 








217 


1258 


2174 


2626 


787 371 


218 


1259 


2175 


2627 


788 11375 


219 


1260 


2176 


2628 


792 6253 


220 


1261 


2177 


2629 


790 20480 


221 


1262 








222 


1263 


2178 


2630 


787 8084 


223 


1264 








224 


1265 


2179 


2631 


790 1787 


225 


1266 


2180 


2632 


787 5659 


226 


1267 


2181 


2633 


790 14480 


227 


1268 


2182 


2634 


790 1801 


228 


1269 








229 


1270 


2183 


2635 


790 22521 


230 


1271 


2184 


2636 


790 3633 


231 


1272 








232 


1273 


2185 


2637 


787 5670 


233 


1274 


2186 


2638 


790 20482 


234 


1275 








235 


1276 


2187 


2639 


790 6685 


236 


1277 


2188 


2640 


785 2624 


237 


1278 








238 


1279 








239 


1280 


2189 


2641 


787 6797 


240 


1281 


2190 


2642 


784 5046 


241 


1282 








242 


1283 








243 


1284 








244 


1285 








245 


1286 








246 


1287 








247 


1288 


2191 


2643 


784 6709 


248 


1289 








249 


1290 








250 


1291 


2192 


2644 


787 3930 


251 


1292 








252 


1293 


2193 


2645 


790 2982 


253 


1294 


2194 


2646 


790 2086 


254 


1295 


255 


1296 








256 


1297 








257 


1298 








258 


1299 


2195 


2647 


784 1280 


259 


1300 








260 


1301 


2196 


2648 


787 9953 


261 


1302 


2197 


2649 


790 4258 


262 


1303 


2198 


2650 


790 16925 


263 


1304 


2199 


2651 


790 1256 


264 


1305 


2200 


2652 


788 6514 


265 


1306 








266 


1307 








267 


1308 








268 


1309 








269 


1310 








270 


1311 








271 


1312 








272 


1313 


2201 


2653 


787 2484 


273 


1314 


2202 


2654 


790 2283 


274 


1315 








275 


1316 


2203 


2655 


787 2505 


276 


1317 


2204 


2656 


790 6292 


277 


1318 








278 


1319 








279 


1320 


2205 


2657 


784 2332 


280 


1321 








281 


1322 








282 


1323 


2206 


2658 


790 2410 


283 


1324 


2207 


2659 


790 6347 


284 


1325 


2208 


2660 


790 12379 


285 


1326 


2209 


2661 


790 2433 


286 


1327 


2210 


2662 


784 8177 


287 


1328 


2211 


2663 


790 2436 


288 


1329 








289 


1330 








290 


1331 








291 


1332 


2212 


2664 


790 2469 


292 


1333 


2213 


2665 


788 7 


293 


1334 


2214 


2666 


784 6493 


294 


1335 








295 


1336 








296 


1337 


2215 


2667 


790 2489 


297 


1338 








298 


1339 








299 


1340 


2216 


2668 


Ton fifing 


300 


1341 


2217 


2669 


787 2576 


301 


1342 


2218 


2670 


790 2537 


302 


1343 








303 


1344 


2219 


2671 


790 2542 


304 


1345 








305 


1346 


306 


1347 


2220 


2672 


784 1031 


307 


1348 








308 


1349 


2221 


2673 


787 3678 


309 


1350 








310 


1351 


2222 


2674 


787 1269 


311 


1352 


2223 


2675 


790 4055 


312 


1353 








313 


1354 








314 


1355 








315 


1356 








316 


1357 








317 


1358 


2224 


2676 


790 2683 


318 


1359 









319 


1360 








320 


1361 








321 


1362 








322 


1363 








323 


1364 








324 


1365 


2225 


2677 


784 2283 


325 


1366 


2226 


2678 


785 999 


326 


1367 








327 


1368 








328 


1369 


2227 


2679 


787 2690 


329 


1370 


2228 


2680 


787 10099 


330 


1371 








331 


1372 


2229 


2681 


787 2706 


332 


1373 


2230 


2682 


790 3751 


333 


1374 


2231 


2683 


787 9316 


334 


1375 


2232 


2684 


790 20358 


335 


1376 


2233 


2685 


784 5053 


336 


1377 








337 


1378 








338 


1379 


2234 


2686 


791 2711 


339 


1380 








340 


1381 


2235 


2687 


784 3427 


341 


1382 








342 


1383 


2236 


2688 


790 2178 


343 


1384 


2237 


2689 


790 1467 


344 


1385 








345 


1386 


2238 


2690 


784 622 l 


346 


1387 


2239 


2691 


791 3194 


347 


1388 


2240 


2692 


790 2886 


348 


1389 


2241 


2693 


790 23660 


349 


1390 








350 


1391 








351 


1392 








352 


1393 








353 


1394 








354 


1395 








355 


1396 


2242 


2694 


784 1062 


356 


1397 


2243 


2695 


784 552 


357 


1398 


2244 


2696 


787 2790 


358 


1399 


2245 


2697 


784 2232 


359 


1400 


2246 


2698 


785 231 


360 


1401 


2247 


2699 


790 11073 


361 


1402 


2248 


2700 


790 2954 


362 


1403 








363 


1404 








364 


1405 








365 


1406 








366 


1407 


2249 


2701 


789 6204 


367 


1408 








368 


1409 








369 


1410 








370 


1411 


2250 


2702 


787 9215 


371 


1412 


2251 


2703 


789 4399 


372 


1413 


2252 


2704 


790 29004 


373 


1414 


2253 


2705 


790 3053 


374 


1415 








375 


1416 








376 


1417 








377 


1418 


2254 


2706 


787 7446 


378 


1419 








379 


1420 








380 


1421 


2255 


2707 


784 2866 


381 


1422 


2256 


2708 


790 3129 


382 


1423 








383 


1424 








384 


1425 


2257 


2709 


787 2844 


385 


1426 


2258 


2710 


790 7572 


386 


1427 


2259 


2711 


792 907 


387 


1428 


2260 


2712 


785 396 


388 


1429 








389 


1430 








390 


1431 








391 


1432 








392 


1433 








393 


1434 








394 


1435 


2261 


2713 


790 3197 


395 


1436 


2262 


2714 


790 26462 


396 


1437 








397 


1438 








398 


1439 








399 


1440 


2263 


2715 


790 3241 


400 


1441 


2264 


2716 


790 14778 


401 


1442 








402 


1443 








403 


1444 








404 


1445 


2265 


2717 


787 6238 


405 


1446 


2266 


2718 


784 2488 


406 


1447 








407 


1448 


2267 


2719 


784 9081 


408 


1449 


2268 


2720 


784 4949 


409 


1450 








410 


1451 








411 


1452 








412 


1453 








413 


1454 








414 


1455 








415 


1456 


2269 


2721 


784 5313 


416 


1457 








417 


1458 


2270 


2722 


784 8649 


418 


1459 








419 


1460 








420 


1461 


2271 


2723 


790 3503 


421 


1462 


2272 


2724 


790 10950 


422 


1463 


2273 


2725 


787 1829 


423 


1464 


2274 


2726 


785 845 


424 


1465 








425 


1466 


2275 


2727 


787 1830 


426 


1467 


2276 


2728 


787 2166 


427 


1468 


2277 


2729 


787 918 


428 


1469 


2278 


2730 


790 2695 


429 


1470 








430 


1471 


2279 


2731 


785 406 


431 


1472 








432 


1473 


2280 


2732 


790 12656 


433 


1474 


2281 


2733 


787 2938 


434 


1475 


2282 


2734 


784 1698 


435 


1476 








436 


1477 


2283 


2735 


787 931 


437 


1478 








438 


1479 


2284 


2736 


787 5985 


439 


1480 


2285 


2737 


787 3966 


440 


1481 


2286 


2738 


790 17389 


441 


1482 


2287 


2739 


787 1371 


442 


1483 


2288 


2740 


784 2299 


443 


1484 








444 


1485 








445 


1486 


2289 


2741 


790 15495 


446 


1487 








447 


1488 


2290 


2742 


787 2985 


448 


1489 








449 


1490 


2291 


2743 


790 4868 


450 


1491 








451 


1492 








452 


1493 


2292 


2744 


70c /tin 


453 


1494 








454 


1495 


2293 


2745 


784 3656 


455 


1496 








456 


1497 








457 


1498 








458 


1499 


459 


1500 


2294 


2746 


790 17074 


460 


1501 








461 


1502 








462 


1503 








463 


1504 








464 


1505 








465 


1506 


2295 


2747 


790 6796 


466 


1507 


2296 


2748 


784 8548 


467 


1508 








468 


1509 








469 


1510 


2297 


2749 


787 4134 


470 


1511 








471 


1512 








472 


1513 


2298 


2750 


785 607 


473 


1514 








474 


1515 


2299 


2751 


784 4444 


475 


1516 








476 


1517 








477 


1518 


2300 


2752 


785 609 1 


478 


1519 


2301 


2753 


787 6219 


479 


1520 


2302 


2754 


790 20198 


480 


1521 








481 


1522 


2303 


2755 


789 5808 


482 


1523 








483 


1524 


2304 


2756 


790 21362 


484 


1525 








485 


1526 








486 


1527 








487 


1528 


2305 


2757 


790 8539 


488 


1529 








489 


1530 


2306 


2758 


790 14555 


490 


1531 








491 


1532 








492 


1533 


2307 


2759 


790 17165 


493 


1534 


2308 


2760 


789 5563 


494 


1535 








495 


1536 








496 


1537 


2309 


2761 


788 10803 


497 


1538 


2310 


2762 


790 1392 


498 


1539 








499 


1540 








500 


1541 








501 


1542 








502 


1543 


2311 


2763 


790 26265 


503 


1544 








504 


1545 








505 


1546 








506 


1547 








507 


1548 


2312 


2764 


790 14264 


508 


1549 








509 


1550 


510 


1551 








511 


1552 








512 


1553 


2313 


2765 


787 419 


513 


1554 


2314 


2766 


791 2696 


514 


1555 








515 


1556 








516 


1557 


2315 


2767 


785 1450 


517 


1558 


2316 


2768 


787_4026 


518 


1559 








519 


1560 


2317 


2769 


790 12340 


520 


1561 








521 


1562 








522 


1563 


2318 


2770 


790 13247 


523 


1564 


2319 


2771 


790 10245 


524 


1565 


2320 


2772 


787 1017 


525 


1566 


2321 


2773 


790 23263 


526 


1567 


2322 


2774 


790 16427 


527 


1568 








528 


1569 


2323 


2775 


789 5186 


529 


1570 


2324 


2776 


790 30441 


530 


1571 


2325 


2777 


789 3709 


531 


1572 


2326 


2778 


790_18037


532 


1573 








533 


1574 


2327 


2779 


785 764 


534 


1575 








535 


1576 


2328 


2780 


789 5283 


536 


1577


2329 


2781 


790 22045 


537 


1578 


2330 


2782 


789 2553 


538 


1579 


2331 


2783 


790 16254 


539 


1580 


2332 


2784 


785 3340 


540 


1581 


2333 


2785 


789 1599 


541 


1582 


2334 


2786 


784 2310 


542 


1583 


2335 


2787 


790 4114 


543 


1584 


2336 


2788 


790 12511 


544 


1585 








545 


1586 








546 


1587 








547 


1588 








548 


1589 


2337 


2789 


788 11639 


549 


1590 








550 


1591 








551 


1592 


2338 


2790 


790 14073 


552 


1593 








553 


1594 


2339 


2791 


790 27205 


554 


1595 








555 


1596 








556 


1597 


2340 


2792 


790 4994 


557 


1598 


2341 


2793 


790 6212 


558 


1599 


2342 


2794 


787 8231 


559 


1600 








560 


1601 


561 


1602 








562 


1603 








563 


1604 








564 


1605 


2343 


2795 


789_3199


565 


1606 


2344 


2796 


784 1039 


566 


1607 








567 


1608 








568 


1609 








569 


1610 








570 


1611 








571 


1612 


2345 


2797 


784 9353 


572 


1613 








573 


1614 


2346 


2798 


790 29553 


574 


1615 








575 


1616 


2347 


2799 


787 669 


576 


1617 








577 


1618 


2348 


2800 


790 4880 


578 


1619 


2349 


2801 


784 2473 


579 


1620 


2350


2802 


791 3397 


580 


1621 








581 


1622 








582 


1623 


2351 


2803 


787 6211 


583 


1624 








584 


1625 








585 


1626 


2352 


2804 


790 19650 


586 


1627 








587 


1628 








588 


1629 








589 


1630 








590 


1631 








591 


1632 








592 


1633 








593 


1634 








594 


1635 








595 


1636 


2353 


2805 


788 1109 


596 


1637 


2354 


2806 


790 12340 


597 


1638 








598 


1639 








599 


1640 


2355 


2807 


790 16631 


600 


1641 


2356 


2808 


784 3763 


601 


1642 








602 


1643 








603 


1644 








604 


1645 








605 


1646 








606 


1647 








607 


1648 








608 


1649 








609 


1650 








610 


1651 








611 


1652 


612 


1653 








613 


1654 








614 


1655 








615 


1656 








616 


1657 








617 


1658 








618 


1659 


2357 


2809 


790 24903 


619 


1660 


2358 


2810 


785 2185 


620 


1661 








621 


1662 








622 


1663 


2359 


2811 


790 20271 


623 


1664 








624 


1665 








625 


1666 








626 


1667 








627 


1668 








628 


1669 








629 


1670 


2360 


2812 


790 14778 


630 


1671 








631 


1672 








632 


1673 








633 


1674 








634


1675 








635 


1676 








636 


1677 








637 


1678 








638 


1679 








639 


1680 








640 


1681 








641 


1682 


2361 


2813 


790 12348 


642 


1683 








643 


1684 








644 


1685 








645 


1686 








646 


1687 


2362 


2814 


790 667 


647 


1688 


2363 


2815 


787 4774 


648 


1689 


2364 


2816 


784_4739


649 


1690 








650 


1691 


2365 


2817 


785 2741 


651 


1692 








652 


1693 








653 


1694 








654 


1695 








655 


1696 


2366 


2818 


787 10308 


656 


1697 








657 


1698 








658 


1699 


2367 


2819 


790 13971 


659 


1700 








660 


1701 








661 


1702 


2368 


2820 


790 1314 


662 


1703 


2369 


2821 


788 6944 


663 


1704 


2370 


2822 


790 2750 


664 


1705 


2371


2823 


787 9604 


665 


1706 


2372 


2824 


784 3541 


666 


1707 








667 


1708 


2373 


2825 


790 20829 


668 


1709 


2374


2826 


789 1765 


669 


1710 









670 


1711 








671 


1712 


2375 


2827 


784 1088 


672 


1713








673 


1714 








674 


1715 








675 


1716 








676 


1717 








677 


1718 








678 


1719 








679 


1720 








680 


1721 








681 


1722 








682 


1723 


2376 


2828 


791 4325 


683 


1724 








684 


1725 








685 


1726 








686 


1727 


2377 


2829 


790 17256 


687 


1728 


2378 


2830 


790 6038 


688 


1729 








689 


1730 








690 


1731 








691 


1732 


2379 


2831


784 1490 


692 


1733 








693 


1734 








694 


1735 








695 


1736 








696 


1737 


2380 


2832 


784 1639 


697 


1738 








698 


1739 








699 


1740 


2381 


2833 


790 3738 


700 


1741 








701 


1742 








702 


1743 








703 


1744 








704 


1745 








705 


1746 








706 


1747 








707 


1748 


2382


2834


784_4029


708 


1749 


2383 


2835 


790 28014 


709 


1750 








710 


1751 


2384 


2836 


792 6483 


711 


1752 








712 


1753 








713 


1754 


714 


1755 


2385 


2837 


790 15616 


715 


1756 








716 


1757 








717 


1758 








718 


1759 








719 


1760 


2386 


2838 


784 1755 


720


1761 








721 


1762 








722 


1763 








723 


1764 








724 


1765 








725 


1766 








726 


1767 








727 


1768 








728 


1769 








729 


1770 








730 


1771 








731 


1772 








732 


1773 


2387 


2839 


784 3304 


733 


1774 


2388 


2840 


785 2998 


734 


1775 








735 


1776 


2389 


2841 


790 5241 


736 


1777 


2390 


2842 


787 6489 


737 


1778 


2391 


2843 


790 29981 


738 


1779 








739 


1780 








740 


1781 








741 


1782 


2392 


2844 


790 6347 


742 


1783 


2393 


2845 


790 14685 


743 


1784 








744 


1785 








745 


1786 


2394 


2846 


787 10117 


746 


1787 








747 


1788 








748 


1789 


2395 


2847 


787 1056 


749 


1790 








750 


1791 


2396 


2848 


785 1047 


751 


1792 


2397 


2849 


791 419 


752 


1793 


2398 


2850 


787 3759 


753 


1794 








754 


1795 


2399 


2851 


785 3304 


755 


1796 








756 


1797 


2400 


2852 


784 4056 


757 


1798 








758 


1799 


2401 


2853 




759 


1800 








760 


1801 








761 


1802 








762 


1803 


2402 


2854 


787 4393 


763 


1804 








764 


1805 


765 


1806 


2403 


2855 


784 3297 


766 


1807 








767 


1808 








768 


1809 


2404 


2856 


784 3609 


769 


1810 








770 


1811 








771 


1812 


2405 


2857 


792 6026 


772 


1813 


2406 


2858 


787 9972 


773 


1814 








774 


1815 








775 


1816 








776 


1817 








777 


1818 








778 


1819 








779 


1820 


2407 


2859 


785_1351


780 


1821 








781 


1822 


2408 


2860 


791 3196 


782 


1823 


2409 


2861 


790_25408


783 


1824 


2410 


2862 


784 3960 


784 


1825 


2411 


2863 


787 4591 


785 


1826 


2412 


2864 


784 4366 


786 


1827 








787 


1828 


2413 


2865 


785 3201 


788 


1829 


2414 


2866 


784 360 


789 


1830 


2415 


2867 


785 1913 


790 


1831 


2416 


2868 


789 2627 


791 


1832 








792 


1833 








793 


1834 








794 


1835 








795 


1836 








796 


1837 








797 


1838 


2417 


2869 


790 2077 


798 


1839 


2418 


2870 


790 19187 


799 


1840 


2419 


2871 


789 3760 


800 


1841 


2420 


2872 


784 6919 


801 


1842 








802 


1843 


2421 


2873 


784 1456 


803 


1844 








804 


1845 








805 


1846 


2422 


2874 


784 5322 


806 


1847 


2423 


2875 


790 1305 


807 


1848 








808 


1849 








809 


1850 








810 


1851 








811 


1852 








812 


1853 








813 


1854 








814 


1855 


2424 


2876 


790 21839 


815 


1856 


816 


1857 








817 


1858 








818 


1859 


2425 


2877 


790 20653 


819 


1860 








820 


1861 


2426 


2878 


784 8235 


821 


1862 


2427 


2879 


792_7381 


822 


1863 








823 


1864 


2428 


2880 


784 2446 


824 


1865 


2429 


2881 


787 5610 


825 


1866 








826 


1867 








827 


1868 


2430 


2882 


787_8030


828 


1869 








829 


1870 








830 


1871


2431 


2883 


784_287


831 


1872 


2432 


2884 


785_2857


832 


1873 








833 


1874 








834 


1875 








835 


1876 








836 


1877 


2433 


2885 


787 7849 


837 


1878 


2434 


2886 


788 4268 


838 


1879 








839 


1880 








840 


1881 








841 


1882 








842 


1883 








843 


1884 








844 


1885 


2435 


2887 


784 3976 


845 


1886 


2436 


2888 


788 13658 


846 


1887 








847 


1888 








848 


1889 


2437 


2889 


784 5652 


849 


1890 


2438 


2890 


784 6881 


850 


1891 


2439 


2891 


784 344 


851 


1892 








852 


1893 








853 


1894 








854 


1895 








855 


1896 








856 


1897 








857 


1898 








858 


1899 


2440


2892 


790 1219 


859 


1900 


2441 


2893 


790 19855 


860 


1901 








861 


1902 


2442 


2894 


784_4089


862 


1903 


2443 


2895 


787_4525


863 


1904 








864 


1905 








865 


1906 


2444 


2896 


791 14 


866 


1907 


867


1908 








868 


1909 








869 


1910 


2445 


2897 


792 8447 


870 


1911 








871 


1912 








872 


1913 


2446 


2898 


790 12289 


873 


1914 








874 


1915 


2447 


2899 


791 938 


875 


1916 


2448 


2900 


787 2708 


876 


1917 


2449 


2901 


790 28624 


877 


1918 








878 


1919 








879 


1920 








880 


1921 


2450 


2902 


790 9414 


881 


1922 








882 


1923 








883 


1924 








884 


1925 


2451 


2903 


790 29172 


885 


1926 


2452 


2904 


785 1259 


886 


1927 








887 


1928 


2453 


2905 


790 11594 


888 


1929 


2454 


2906 


790 4305 


889 


1930 


2455 


2907 


792 4498 


890 


1931 








891 


1932 








892 


1933 








893 


1934 








894 


1935 








895 


1936 








896 


1937 


2456 


2908 


790 2984 


897 


1938 








898 


1939 


2457 


2909 


790 11010 


899 


1940 


2458 


2910 


790 21318 


900 


1941 


2459 


2911 


790_3969


901 


1942 


2460 


2912 


785 3697 


902 


1943 


2461 


2913 


785 3750 


903 


1944 


2462 


2914 


787 10293 


904 


1945 


2463 


2915 


787 5468 


905 


1946 








906 


1947 


2464 


2916 


784 4027 


907 


1948 








908 


1949 


2465 


2917 


791 1076 


909 


1950 


2466 


2918 


790 14655 


910 


1951 








911 


1952 


2467


2919


/oo i i iOi 


912 


1953 


2468 


2920 


784 3554 


913 


1954 


2469 


2921 


784 6827 


914 


1955 








915 


1956 








916 


1957 








917 


1958 


2470 


2922 


789 4549 


918 


1959 








919 


1960 


2471 


2923 


790 948 


920 


1961 








921 


1962 


2472 


2924 


789 682 


922 


1963 


2473 


2925 


787 2281 


923 


1964 








924 


1965 


2474 


2926 


790_11999


925 


1966 


2475 


2927 


790 28325 


926 


1967 


2476 


2928 


790 7793 


927 


1968 


2477 


2929 


792 3501 


928 


1969 








929 


1970 


2478 


2930 


790 4547 


930 


1971 


2479 


2931 


788 5864 


931 


1972 








932 


1973 


2480 


2932 


790 24604 


933 


1974 








934 


1975 


2481 


2933 


790 25716 


935 


1976 


2482 


2934 


785 1851 


936 


1977 


2483 


2935 


785 1852 


937 


1978 


2484 


2936 


785 1155 


938 


1979 


2485 


2937 


785 3352 


939 


1980 








940 


1981 


2486 


2938 


785 1297 


941 


1982 


2487 


2939 


785 477 


942 


1983 


2488 


2940 


785 2441 


943 


1984 


2489 


2941 


785 1294 


944 


1985 








945 


1986 








946 


1987 








947 


1988 


2490 


2942 


789 4549 


948 


1989 


2491 


2943 


784 6979 


949 


1990 


2492 


2944 


784 8567 


950 


1991 


2493 


2945 


790 14286 


951 


1992 


2494 


2946 


784 8986 


952 


1993 








953 


1994 


2495 


2947 


790 12510 


954 


1995 








955 


1996 








956 


1997 








957 


1998 


2496 


2948 


787 3623 


958 


1999 








959 


2000 








960


2001 








961 


2002 


2497 


2949 


792 4842 


962 


2003 


2498 


2950 


784 9156 


963 


2004 








964 


2005 








965 


2006 








966 


2007 


2499 


2951 


784 2649 


967 


2008 


2500 


2952 


785 544 


968 


2009 


2501 


2953 


787 4148 


969 


2010 








970 


2011 


2502 


2954 


784 5145 


971 


2012 








972 


2013 


2503 


2955 


784 919 


973 


2014 








974 


2015 


2504 


2956 


787 2532 


975 


2016 


2505 


2957 


788 13689 


976 


2017 








977 


2018 


2506 


2958 


784 2950 


978 


2019 








979 


2020 








980 


2021 


2507 


2959 


784 4027 


981 


2022 


2508 


2960 


785 332 


982 


2023 








983 


2024 








984 


2025 


2509 


2961 


784 1944 


985 


2026 


2510 


2962 


787 6916 


986 


2027 


2511 


2963 


787_2539


987 


2028 








988 


2029 


2512 


2964 


787 10243 


989 


2030 








990 


2031 








991 


2032 


2513 


2965 


787 5673 


992 


2033 








993 


2034 








994 


2035 








995 


2036 








996 


2037 








997 


2038 








998 


2039 


2514 


2966 


787 2168 


999 


2040 


2515 


2967 


784 1151 


1000 


2041 








1001 


2042 








1002 


2043 


2516 


2968 


787 3680 


1003 


2044 


2517 


2969 


787 5181 


1004 


2045 


2518 


2970 


787 3356 


1005 


2046 


2519 


2971 


785 254 


1006 


2047 








1007 


2048 








1008 


2049 


2520 


2972 


789 1109 


1009 


2050 








1010 


2051 








1011 


2052 


2521 


2973 


790 7032 


1012 


2053 


2522 


2974 


791 4111 


1013 


2054 








1014 


2055 








1015 


2056 


2523 


2975 


790 11262 


1016 


2057 


2524 


2976 


787 2040 


1017 


2058 








1018 


2059 








1019 


2060 


1020 


2061 








1021 


2062 


2525 


2977 


785 1902 


1022 


2063 


2526 


2978 


790 12167 


1023 


2064 








1024 


2065 








1025 


2066 








1026 


2067 








1027 


2068 


2527 


2979 


784 9027 


1028 


2069 


2528 


2980 


790 8294 


1029 


2070 








1030 


2071 


2529 


2981 


784 5029 


1031 


2072 


2530 


2982 


784 3541 


1032 


2073 








1033 


2074 


2531 


2983 


787 5870 


1034 


2075 








1035 


2076 


2532 


2984 


/Of Z/JJ 


1036 


2077 


2533 


2985 


785 581 


1037 


2078 


2534 


2986 


787 9345 


1038 


2079 








1039 


2080 








1040 


2081






1041 


2082







*784_XXX = SEQ ID NO: XXX of Attorney Docket No. 784, US Serial No. 09/488,725
filed 01/21/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

785_XXX = SEQ ID NO: XXX of Attorney Docket No. 785, US Serial No. 09/491,404 
filed 01/25/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

787_XXX = SEQ ID NO: XXX of Attorney Docket No. 787, US Serial No. 09/496,914 
filed 02/03/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

788_XXX = SEQ ID NO: XXX of Attorney Docket No. 788, US Serial No. 09/515,126 
filed 02/28/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

789_XXX = SEQ ID NO: XXX of Attorney Docket No. 789, US Serial No. 09/519,705
filed 03/07/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

790_XXX = SEQ ID NO: XXX of Attorney Docket No. 790, US Serial No. 09/540,217
filed 03/31/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

791_XXX = SEQ ID NO: XXX of Attorney Docket No. 791, US Serial No. 09/552,929 
filed 04/18/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

792_XXX = SEQ ID NO: XXX of Attorney Docket No. 792, US Serial No. 09/577,408 
filed 05/18/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 
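
The footnote key above can be applied mechanically to the last column of Table 9. The following is a minimal illustrative sketch and not part of the original disclosure. It assumes each entry has the form docket number, separator, SEQ ID NO, and it accepts a space as well as an underscore because many cells in this copy show a space where the underscore was lost in scanning. The names PRIORITY_APPLICATIONS, PriorityRef and parse_priority_ref are hypothetical; the serial numbers and filing dates are copied from the footnote.

```python
import re
from typing import NamedTuple

# Attorney Docket No. -> (US Serial No., filing date), as given in the Table 9 footnote.
PRIORITY_APPLICATIONS = {
    "784": ("09/488,725", "01/21/2000"),
    "785": ("09/491,404", "01/25/2000"),
    "787": ("09/496,914", "02/03/2000"),
    "788": ("09/515,126", "02/28/2000"),
    "789": ("09/519,705", "03/07/2000"),
    "790": ("09/540,217", "03/31/2000"),
    "791": ("09/552,929", "04/18/2000"),
    "792": ("09/577,408", "05/18/2000"),
}


class PriorityRef(NamedTuple):
    docket: str      # Attorney Docket No. of the priority application
    seq_id_no: int   # SEQ ID NO within that priority application
    serial_no: str   # US Serial No. of the priority application
    filed: str       # filing date, MM/DD/YYYY


# Three-digit docket number, an underscore or a space, then the SEQ ID NO.
_CELL = re.compile(r"^(\d{3})[_ ](\d+)$")


def parse_priority_ref(cell: str) -> PriorityRef:
    """Resolve a Table 9 entry such as '787_4026' to its priority application."""
    match = _CELL.match(cell.strip())
    if match is None:
        raise ValueError(f"not a docket_SEQ ID NO entry: {cell!r}")
    docket, seq_id_no = match.group(1), int(match.group(2))
    if docket not in PRIORITY_APPLICATIONS:
        raise ValueError(f"docket {docket} is not listed in the footnote")
    serial_no, filed = PRIORITY_APPLICATIONS[docket]
    return PriorityRef(docket, seq_id_no, serial_no, filed)
```

For example, parse_priority_ref("787_4026") resolves to SEQ ID NO: 4026 of Attorney Docket No. 787, US Serial No. 09/496,914. Entries that remain illegible in this copy are rejected with a ValueError rather than guessed.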






WO 03/080795 



PCT/US02/25485 



Table 10 



SEQ ID NO of Full-length 
Nucleotide Sequence 


SEQ ID NO of Full-length 
Peptide Sequence 


SEQ ID NO in 
Priority Application 
USSN 60/311,261 








1 


1042 


1 


2 


1043 


2 


3 


1044


3 


4


1045 


4 


5 


1046 


5 


6 


1047 


6 


7 


1048 


7 


8 


1049 


8 


9 


1050 


9 


10 


1051 


10 


11 


1052 


11 


12 


1053 


12 


13 


1054


13 


14 


1055 


14 


15 


1056 


15 


16 


1057


16 


17 


1058 


17 


18 


1059 


18 


19 


1060 


19 


20 


1061 


20 


21 


1062 


21 


22 


1063 


22 


23 


1064 


23 


24 


1065 


24 


25 


1066 


25 


26 


1067 


26 


27 


1068 


27 


28


1069 


28 


29 


1070 


29 


30 


1071 


30 


31 


1072 


31 


32 


1073 


32 


33 


1074 


33 


34 


1075 


34 


35 


1076 


35 


36 


1077 


36 


37 


1078 


37 


38 


1079 


38 


39 


1080 


39 


40 


1081 


40 


41 


1082 


41 


42 


1083 


42 


43 


1084 


43 


44 


1085 


44 


45 


1086 


45 


46 


1087 


46 


47 


1088 


47 


48 


1089 


48 


49 


1090 


49 


50 


1091 


50 


51 


1092 


51 


52 


1093 


52 


53 


1094 


53 


54 


1095 


54 


55 


1096 


55 


56 


1097 


56 


57 


1098 


57 


58 


1099 


58


59 


1100 


59 


60 


1101 


60 


61 


1102 


61 


62 


1103 


62 


63 


1104 


63 


64 


1105 


64 


65 


1106 


65 


66 


1107 


66 


67 


1108 


67 


68 


1109 


68 


69 


1110 


69 


70 


1111 


70 


71 


1112


71 


72 


1113 


72 


73 


1114 


73 


74 


1115 


74 


75 


1116 


75 


76 


1117 


76 


77 


1118 


77 


78 


1119 


78 


79 


1120 


79 


80 


1121 


80 


81 


1122 


81 


82 


1123 


82 


83 


1124 


83


84 


1125 


84 


85


1126 


85 


86 


1127 


86 


87 


1128 


87 


88 


1129 


88 


89 


1130 


89 


90 


1131 


90 


91 


1132 


91


92 


1133 


92 


93 


1134 


93 


94 


1135 


94 


95 


1136 


95 


96 


1137 


96 


97 


1138 


97 


98 


1139


98


99 


1140 


99 


100 


1141 


100 


101 


1142 


101 


102 


1143 


102 


103 


1144 


103 


104 


1145 


104 


105 


1146 


105 


106 


1147 


106 


107 


1148 


107 


108 


1149 


108 


109 


1150 


109


110 


1151 


110 


111 


1152 


111 


112 


1153 


112 


113 


1154 


113 


114 


1155 


114 


115 


1156 


115 


116 


1157 


116 


117 


1158 


117 


118 


1159 


118 


119 


1160 


119 


120 


1161 


120 


121 


1162 


121 


122 


1163 


122 


123 


1164 


123 


124 


1165 


124 


125 


1166 


125 


126 


1167 


126 


127 


1168 


127 


128 


1169 


128 


129 


1170 


129 


130 


1171 


130 


131 


1172 


131 


132 


1173 


132 


133 


1174 


133 


134 


1175 


134 


135 


1176 


135 


136 


1177 


136 


137 


1178 


137 


138 


1179 


138 


139 


1180 


139 


140 


1181 


140 


141 


1182 


141 


142 


1183 


142 


143 


1184 


143 


144 


1185 


144 


145 


1186 


145 


146 


1187 


146 


147 


1188 


147 


148 


1189 


148 


149 


1190 


149 


150 


1191 


150 


151 


1192


151


152 


1193 


152 


153 


1194 


153 


154 


1195 


154 


155 


1196 


155 


156 


1197 


156 


157 


1198 


157 


158 


1199 


158


159 


1200 


159


160 


1201 


160 


161 


1202 


161


162 


1203 


162 


163 


1204 


163 


164 


1205 


164 


165


1206 


165 


166 


1207 


166 


167 


1208 


167 


168 


1209 


168 


169 


1210 


169 


170 


1211 


170 


171 


1212 


171 


172 


1213 


172 


173 


1214 


173 


174 


1215 


174 


175 


1216 


175 


176 


1217 


176 


177 


1218 


177 


178 



1219


178 


179 



1220 


179 


180 


1221 


180 


181 


1222 


181


182 


1223 


182 


183 


1224 


183 


184 


1225 


184 


185 


1226 


185 


186 


1227 


186 


187 


1228 


187 


188 


1229 


188


189 


1230 


189 


190 


1231 


190 


191 


1232 


191


192 


1233 


192 


193 


1234 


193 


194 


1235 


194 


195 


1236 


195 


196 


1237 


196 


197 


1238 


197 


198 


1239 


198 


199 


1240 


199


200 


1241 


200 


201 


1242 


201 


202 


1243 


202 


203 


1244 


203 


204 


1245 


204


205 


1246 


205 


206 


1247 


206 


207 


1248 


207 


208 


1249 


208 


209 


1250 


209 


210 


1251 


210 


211 


1252 


211 


212 


1253 


212 


213 


1254 


213 


214 


1255 


214 


215 


1256 


215 


216 


1257 


216 


217 


1258 


217 


218 


1259 


218 


219 


1260 


219 


220 


1261 


220 


221 


1262 


221 


222 


1263 


222 


223 


1264 


223 


224 


1265 


224 


225 


1266 


225 


226 


1267 


226 


227 


1268 


227 


228 


1269 


228 


229 


1270 


229 


230 


1271


230 


231 


1272 


231 


232 


1273 


232 


233 


1274 


233 


234 


1275 


234 


235 


1276


235 


236 


1277 


236 


237 


1278 


237 


238 


1279 


238 


239 


1280 


239 


240 


1281 


240 


241 


1282 


241 


242 


1283 


242 


243 


1284 


243 


244 


1285 


244 


245 


1286 


245 


246 


1287 


246 


247 


1288 


247 


248 


1289 


248 


249 


1290 


249 


250 


1291 


250 


251 


1292 


251 


252 


1293 


252 


253 


1294 


253 


254 


1295 


254 


255 


1296 


255 


256 


1297 


256 


257 


1298 


257 


258 


1299 


258 


259 


1300 


259


260 


1301 


260 


261 


1302 


261


262 


1303 


262 


263 


1304 


263 


264 


1305 


264


265 


1306 


265 


266 


1307 


266 


267 


1308 


267 


268 


1309 


268 


269 


1310 


269 


270 


1311 


270 


271 


1312 


271 


272 


1313 


272 


273 


1314 


273 


274 


1315 


274 


275 


1316 


275 


276 


1317 


276 


277 


1318 


277 


278 


1319 


278 


279 


1320 


279 


280 


1321 


280 


281 


1322 


281 


282


1323 


282 


283


1324 


283 


284


1325 


284 


285


1326 


285 


286


1327 


286 


287


1328 


287 


288


1329 


288 


289 


1330 


289 


290 


1331 


290 


291 


1332 


291 


292 


1333 


292 


293 


1334 


293 


294 


1335 


294 


295 


1336 


295 


296 


1337 


296 


297 


1338 


297 


298 


1339 


298 


299 


1340 


299 


300 


1341 


300 


301 


1342 


301 


302 


1343 


302 


303 


1344 


303 


304 


1345 


304 


305 


1346 


305 


306 


1347 


306


307 


1348 


307 


308 


1349 


308 


309 


1350 


309 


310 


1351 


310 


311 


1352 


311 


312 


1353 


312 


313 


1354 


313 


314 


1355 


314 


315 


1356 


315 


316 


1357 


316 


317 


1358 


317 


318 


1359 


318 


319 


1360 


319 


320 


1361 


320 


321 


1362 


321 


322 


1363 


322


323


1364 


323 


324 


1365 


324 


325 


1366


325 


326 


1367 


326 


327 


1368 


327 


328 


1369 


328 


329 


1370 


329 


330 


1371 


330 


331 


1372 


331 


332 


1373 


332 


333 


1374 


333 


334 


1375 


334 


335 


1376 


335 


336 


1377 


336 


337 


1378 


337 


338 


1379 


338 


339 


1380 


339 


340 


1381 


340 


341 


1382


341 


342 


1383 


342 


343 


1384 


343 


344 


1385 


344 


345 


1386 


345 


346 


1387 


346 


347 


1388 


347 


348 


1389 


348 


349 


1390 


349 


350 


1391 


350 


351 


1392 


351 


352 


1393 


352 


353 


1394 


353 


354 


1395 


354


355 


1396 


355 


356 


1397 


356 


357 


1398 


357 


358 


1399 


358 


359 


1400 


359 


360 


1401 


360 


361 


1402 


361 


362 


1403 


362 


363 


1404 


363 


364 


1405 


364 


365 


1406 


365


366 


1407 


366 


367 


1408 


367 


368 


1409 


368 


369 


1410 


369


370 


1411 


370 
371


1412 


371 


372 


1413 


372 


373 


1414 


373 


374 


1415 


374 


375 


1416 


375 


376 


1417 


376 


377 


1418


377 


378 


1419 


378 


379 


1420 


379 


380 


1421 


380 


381 


1422 


381 


382 


1423 


382 


383 


1424 


383 


384 


1425 


384 


385 


1426 


385 


386 


1427 


386 


387 


1428 


387 


388 


1429 


388 


389 


1430 


389 


390 


1431 


390 


391 


1432 


391 


392 


1433 


392 


393 


1434 


393 


394 


1435 


394 


395 


1436 


395 


396 


1437 


396 


397 


1438 


397 


398 


1439 


398 


399 


1440 


399 


400 


1441 


400 


401 


1442 


401 


402 


1443 


402 


403 


1444 


403


404 


1445 


404


405 


1446 


405 


406 


1447 


406 


407 


1448 


407 


408 


1449 


408 


409 


1450 


409 


410 


1451 


410 


411 


1452 


411


412 


1453 


412 


413 


1454 


413 


414 


1455 


414 


415 


1456 


415 


416 


1457 


416 


417 


1458 


417 


418 


1459 


418 


419 


1460 


419 


420 


1461 


420 


421 


1462 


421 


422 


1463 


422 


423 


1464 


423 
424


1465 


424 


425


1466 


425 


426 


1467 


426 


427 


1468 


427 


428 


1469 


428 


429 


1470 


429 


430 


1471 


430 


431 


1472 


431


432 


1473 


432 


433 


1474 


433 


434 


1475 


434 


435 1 


1476 


435 


436 


1477 


436 


437 


1478 


437 


438 


1479 


438 


439 


1480 


439 


440 


1481 


440 


441 


1482 


441 


442 


1483 


442 1 


443 


1484 


443 


444 


1485 


444 


445 


1486 


445 


446 


1487 


446 


447 


1488 


447 J 


448 


1489 


448 


449 


1490 


449 


450    1491    450
451    1492    451
452    1493    452
453    1494    453
454    1495    454
455    1496    455
456    1497    456
457    1498    457
458    1499    458
459    1500    459
460    1501    460
461    1502    461
462    1503    462
463    1504    463
464    1505    464
465    1506    465
466    1507    466
467    1508    467
468    1509    468
469    1510    469
470    1511    470
471    1512    471
472    1513    472
473    1514    473
474    1515    474
475    1516    475
476    1517    476
477    1518    477
478    1519    478
479    1520    479
480    1521    480
481    1522    481
482    1523    482
483    1524    483
484    1525    484
485    1526    485
486    1527    486
487    1528    487
488    1529    488
489    1530    489
490    1531    490
491    1532    491
492    1533    492
493    1534    493
494    1535    494
495    1536    495
496    1537    496
497    1538    497
498    1539    498
499    1540    499


500    1541    500
501    1542    501
502    1543    502
503    1544    503
504    1545    504
505    1546    505
506    1547    506
507    1548    507
508    1549    508
509    1550    509
510    1551    510
511    1552    511
512    1553    512
513    1554    513
514    1555    514
515    1556    515
516    1557    516
517    1558    517
518    1559    518
519    1560    519
520    1561    520
521    1562    521
522    1563    522
523    1564    523
524    1565    524
525    1566    525
526    1567    527
527    1568    528
528    1569    529
529    1570    530
530    1571    531
531    1572    532
532    1573    533
533    1574    534
534    1575    535
535    1576    536
536    1577    537
537    1578    538
538    1579    539
539    1580    540
540    1581    541
541    1582    542
542    1583    543
543    1584    544
544    1585    545
545    1586    546
546    1587    547
547    1588    548
548    1589    549
549    1590    550


550    1591    551
551    1592    552
552    1593    553
553    1594    554
554    1595    555
555    1596    556
556    1597    557
557    1598    558
558    1599    559
559    1600    560
560    1601    561
561    1602    562
562    1603    563
563    1604    564
564    1605    565
565    1606    566
566    1607    567
567    1608    568
568    1609    569
569    1610    570
570    1611    571
571    1612    572
572    1613    573
573    1614    574
574    1615    575
575    1616    576
576    1617    577
577    1618    578
578    1619    579
579    1620    580
580    1621    581
581    1622    582
582    1623    583
583    1624    584
584    1625    585
585    1626    586
586    1627    587
587    1628    588
588    1629    589
589    1630    590
590    1631    591
591    1632    592
592    1633    593
593    1634    594
594    1635    595
595    1636    596
596    1637    597
597    1638    598
598    1639    599
599    1640    600


600    1641    601
601    1642    602
602    1643    603
603    1644    604
604    1645    605
605    1646    606
606    1647    607
607    1648    608
608    1649    609
609    1650    610
610    1651    611
611    1652    612
612    1653    613
613    1654    614
614    1655    615
615    1656    616
616    1657    617
617    1658    618
618    1659    619
619    1660    620
620    1661    621
621    1662    622
622    1663    623
623    1664    624
624    1665    625
625    1666    626
626    1667    627
627    1668    628
628    1669    629
629    1670    630
630    1671    631
631    1672    632
632    1673    633
633    1674    634
634    1675    635
635    1676    636
636    1677    637
637    1678    638
638    1679    639
639    1680    640
640    1681    641
641    1682    642
642    1683    643
643    1684    644
644    1685    645
645    1686    646
646    1687    647
647    1688    648
648    1689    649
649    1690    650


650    1691    651
651    1692    652
652    1693    653
653    1694    654
654    1695    655
655    1696    656
656    1697    657
657    1698    658
658    1699    659
659    1700    660
660    1701    661
661    1702    662
662    1703    663
663    1704    664
664    1705    665
665    1706    666
666    1707    667
667    1708    668
668    1709    669
669    1710    670
670    1711    671
671    1712    672
672    1713    673
673    1714    674
674    1715    675
675    1716    676
676    1717    677
677    1718    678
678    1719    679
679    1720    680
680    1721    681
681    1722    682
682    1723    683
683    1724    684
684    1725    685
685    1726    686
686    1727    687
687    1728    688
688    1729    689
689    1730    690
690    1731    691
691    1732    692
692    1733    693
693    1734    694
694    1735    695
695    1736    696
696    1737    697
697    1738    698
698    1739    699
699    1740    700


700    1741    701
701    1742    702
702    1743    703
703    1744    704
704    1745    705
705    1746    706
706    1747    707
707    1748    708
708    1749    709
709    1750    710
710    1751    711
711    1752    712
712    1753    713
713    1754    714
714    1755    715
715    1756    716
716    1757    717
717    1758    718
718    1759    719
719    1760    720
720    1761    721
721    1762    722
722    1763    723
723    1764    724
724    1765    725
725    1766    726
726    1767    727
727    1768    728
728    1769    729
729    1770    730
730    1771    731
731    1772    732
732    1773    733
733    1774    734
734    1775    735
735    1776    736
736    1777    737
737    1778    738
738    1779    739
739    1780    740
740    1781    741
741    1782    742
742    1783    743
743    1784    744
744    1785    745
745    1786    746
746    1787    747
747    1788    748
748    1789    749
749    1790    750


750    1791    751
751    1792    752
752    1793    753
753    1794    754
754    1795    755
755    1796    756
756    1797    757
757    1798    758
758    1799    759
759    1800    760
760    1801    761
761    1802    762
762    1803    763
763    1804    764
764    1805    765
765    1806    766
766    1807    767
767    1808    768
768    1809    769
769    1810    770
770    1811    771
771    1812    772
772    1813    773
773    1814    774
774    1815    775
775    1816    776
776    1817    777
777    1818    778
778    1819    779
779    1820    780
780    1821    781
781    1822    782
782    1823    783
783    1824    784
784    1825    785
785    1826    786
786    1827    787
787    1828    788
788    1829    789
789    1830    790
790    1831    791
791    1832    792
792    1833    793
793    1834    794
794    1835    795
795    1836    796
796    1837    797
797    1838    798
798    1839    799
799    1840    800


800    1841    801
801    1842    802
802    1843    803
803    1844    804
804    1845    805
805    1846    806
806    1847    807
807    1848    808
808    1849    809
809    1850    810
810    1851    811
811    1852    812
812    1853    813
813    1854    814
814    1855    815
815    1856    816
816    1857    817
817    1858    818
818    1859    819
819    1860    820
820    1861    821
821    1862    822
822    1863    823
823    1864    824
824    1865    825
825    1866    826
826    1867    827
827    1868    828
828    1869    829
829    1870    830
830    1871    831
831    1872    832
832    1873    833
833    1874    834
834    1875    835
835    1876    836
836    1877    837
837    1878    838
838    1879    839
839    1880    840
840    1881    841
841    1882    842
842    1883    843
843    1884    844
844    1885    845
845    1886    846
846    1887    847
847    1888    848
848    1889    849
849    1890    850


850    1891    851
851    1892    852
852    1893    853
853    1894    854
854    1895    855
855    1896    856
856    1897    857
857    1898    858
858    1899    859
859    1900    860
860    1901    861
861    1902    862
862    1903    863
863    1904    864
864    1905    865
865    1906    866
866    1907    867
867    1908    868
868    1909    869
869    1910    870
870    1911    871
871    1912    872
872    1913    873
873    1914    874
874    1915    875
875    1916    876
876    1917    877
877    1918    878
878    1919    879
879    1920    880
880    1921    881
881    1922    882
882    1923    883
883    1924    884
884    1925    885
885    1926    886
886    1927    887
887    1928    888
888    1929    889
889    1930    890
890    1931    891
891    1932    892
892    1933    893
893    1934    894
894    1935    895
895    1936    896
896    1937    897
897    1938    898
898    1939    899
899    1940    900


900    1941    901
901    1942    902
902    1943    903
903    1944    904
904    1945    905
905    1946    906
906    1947    907
907    1948    908
908    1949    909
909    1950    910
910    1951    911
911    1952    912
912    1953    913
913    1954    914
914    1955    915
915    1956    916
916    1957    917
917    1958    918
918    1959    919
919    1960    920
920    1961    921
921    1962    922
922    1963    923
923    1964    924
924    1965    925
925    1966    926
926    1967    927
927    1968    928
928    1969    929
929    1970    930
930    1971    931
931    1972    932
932    1973    933
933    1974    934
934    1975    935
935    1976    936
936    1977    937
937    1978    938
938    1979    939
939    1980    940
940    1981    941
941    1982    942
942    1983    943
943    1984    944
944    1985    945
945    1986    946
946    1987    947
947    1988    948
948    1989    949
949    1990    950


950    1991    951
951    1992    952
952    1993    953
953    1994    954
954    1995    955
955    1996    956
956    1997    957
957    1998    958
958    1999    959
959    2000    960
960    2001    961
961    2002    962
962    2003    963
963    2004    964
964    2005    965
965    2006    966
966    2007    967
967    2008    968
968    2009    969
969    2010    970
970    2011    971
971    2012    972
972    2013    973
973    2014    974
974    2015    975
975    2016    976
976    2017    977
977    2018    978
978    2019    979
979    2020    980
980    2021    981
981    2022    982
982    2023    983
983    2024    984
984    2025    985
985    2026    986
986    2027    987
987    2028    988
988    2029    989
989    2030    990
990    2031    991
991    2032    992
992    2033    993
993    2034    994
994    2035    995
995    2036    996
996    2037    997
997    2038    998
998    2039    999
999    2040    1000
1000    2041    1001
1001    2042    1002
1002    2043    1003
1003    2044    1004
1004    2045    1005
1005    2046    1006
1006    2047    1007

SEQ ID NO of Full-length Peptide Sequence
SEQ ID NO in Priority Application USSN 60/311,261


1007    2048    1008
1008    2049    1009
1009    2050    1010
1010    2051    1011
1011    2052    1012
1012    2053    1013
1013    2054    1014
1014    2055    1015
1015    2056    1016
1016    2057    1017
1017    2058    1018
1018    2059    1019
1019    2060    1020
1020    2061    1021
1021    2062    1022
1022    2063    1023
1023    2064    1024
1024    2065    1025
1025    2066    1026
1026    2067    1027
1027    2068    1028
1028    2069    1029
1029    2070    1030
1030    2071    1031
1031    2072    1032
1032    2073    1033
1033    2074    1034
1034    2075    1035
1035    2076    1036
1036    2077    1037
1037    2078    1038
1038    2079    1039
1039    2080    1040
1040    2081    1041
1041    2082    1042

WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group
consisting of SEQ ID NO: 1-1041.

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein 
said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein 
said polynucleotide has greater than about 99% sequence identity with the polynucleotide of 
claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting 
of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1;
and

(b) a polypeptide encoded by a polynucleotide hybridizing under
stringent conditions with any one of SEQ ID NO: 1-1041.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; and 

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions with 
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions; 

b) amplifying a product comprising at least a portion of the 
polynucleotide of claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in the
sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a 
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising:

a) contacting the compound with the polypeptide of claim 10 under
conditions sufficient to form a polypeptide/compound complex; and

b) detecting the complex, so that if the polypeptide/compound complex 
is detected, a compound that binds to the polypeptide of claim 10 is identified. 



18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a cell, 
under conditions sufficient to form a polypeptide/compound complex, wherein the complex 
drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, 
so that if the polypeptide/compound complex is detected, a compound that binds to the 
polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising:

a) culturing a host cell comprising a polynucleotide sequence selected 
from the group consisting of any of the polynucleotides from SEQ ID NO: 1-1041, under 
conditions sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group
consisting of any one of the polypeptides SEQ ID NO: 1042-2082.

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide 
array. 

22. A collection of polynucleotides, wherein the collection comprises at least one of
SEQ ID NO: 1-1041.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the
polynucleotides in the collection.

25. The collection of claim 23, wherein the array detects mismatches to any one of the
polynucleotides in the collection.

26. The collection of claim 22, wherein the collection is provided in a computer-readable
format.